Why do we see woods as woods? What makes up the visual impression of a hill scene? What exactly do we visually perceive of the world? And can a computer perceive this as well? Can perception be calculated? My research addresses the visual appearance of scenes and objects in order to approach these questions from a computational perspective. Physical, biological, and geological constraints are reflected in images. By studying intrinsic image properties and image statistics, I aim to make such constraints explicit. With the recent explosion in available digital photo collections, exploring and exploiting natural image statistics has become a real possibility. Modelling scene appearance at a detailed level has the potential to reveal many aspects and mechanisms of visual perception for machine cognition.

PercepTech company founded

Published June 8th, 2011

I have founded the company PercepTech, which focuses on computer vision research, consultancy, and prototype development. The company is active in the screening of security papers and in high-profile CCTV. In addition, the company has extensive experience with inspection vision applications, where reliability and real-time processing are mandatory.

Stages as Models for Scene Geometry

Published August 6th, 2009

Our paper Stages as models for scene geometry by Vladimir Nedovic, Arnold W.M. Smeulders, Andre Redert (Philips Research), and myself has been accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence. In this paper we study how to roughly estimate scene geometry from single images. Reconstruction of 3D scene geometry is an important element of scene understanding. We show that depth estimation can be approached as a categorisation problem, where the scene is assigned to one of 15 typical 3D geometry models, called stages, each with a characteristic depth profile. These stages roughly cover the large majority of geometries occurring in images. We show how changes in camera viewing direction and zooming convert the geometry from one stage into another, covering the continuum of possible camera motions. Furthermore, we show that, in principle, these stages can be derived from statistical properties of the image content. Hence, statistical machine learning techniques can be used to determine which stage best approximates the geometry of the image content. Our approach of stage categorisation serves as a first approximation of global depth, narrowing down the search space in precise depth estimation and object localization. The paper is available from here.
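
To illustrate what treating depth estimation as categorisation means, here is a minimal sketch in Python. It is not the classifier or the features used in the paper: it assumes each image has already been reduced to a small vector of image statistics, and it uses a nearest-centroid rule as a stand-in for any statistical learner. The labels stage_a and stage_b, the feature values, and the function names fit_centroids and nearest_stage are hypothetical.

```python
import math
from collections import defaultdict

def fit_centroids(examples):
    """Average the feature vectors of each stage label.

    examples: iterable of (stage_label, feature_vector) pairs.
    Both the labels and the features are hypothetical placeholders.
    """
    grouped = defaultdict(list)
    for label, feature in examples:
        grouped[label].append(feature)
    return {
        label: tuple(sum(dim) / len(feats) for dim in zip(*feats))
        for label, feats in grouped.items()
    }

def nearest_stage(feature, centroids):
    """Assign an image to the stage whose centroid is closest (Euclidean)."""
    return min(centroids, key=lambda label: math.dist(feature, centroids[label]))

# Toy training data: two made-up stages in a 2D feature space.
training = [
    ("stage_a", (0.0, 0.0)),
    ("stage_a", (2.0, 0.0)),
    ("stage_b", (10.0, 10.0)),
]
centroids = fit_centroids(training)
print(nearest_stage((0.5, 0.5), centroids))
```

Once an image has a stage label, the characteristic depth profile of that stage can serve as the coarse depth estimate that a more precise method refines.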

Color Constancy using 3D Stage Geometry

Published July 15th, 2009

Rui Lu and Arjan Gijsenij's paper Color Constancy Using 3D Stage Geometry, co-authored by Theo Gevers, De Xu, Vladimir Nedovic, and myself, has been accepted at the IEEE International Conference on Computer Vision, Kyoto, Japan, Sept 29th - Oct 2nd. The aim of color constancy is to remove the effect of the color of the light source. As color constancy is inherently an ill-posed problem, most existing color constancy algorithms are based on specific imaging assumptions such as the grey-world and white-patch assumptions. In the paper, we apply 3D geometry models to determine which color constancy method to use for the different geometrical regions found in images. To this end, images are first classified into stages (rough 3D geometry models); see the work of Vladimir Nedovic in PAMI. According to the stage models, images are divided into different regions using hard and soft segmentation. After that, the best color constancy algorithm is selected for each geometry segment. As a result, light source estimation is tuned to the global scene geometry. Our algorithm opens the possibility to estimate the color of the remote scene illumination, by distinguishing nearby light sources from distant illuminants. Experiments on large-scale image datasets show that the proposed algorithm outperforms state-of-the-art single color constancy algorithms, with an improvement of almost 14% in median angular error. When using an ideal classifier (i.e., all of the test images are correctly classified into stages), the proposed method achieves an improvement of 31% in median angular error compared to the best-performing single color constancy algorithm. The paper is available from here.
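
For readers unfamiliar with the two imaging assumptions and the error measure mentioned above, the following Python sketch shows them in their textbook form. It is not the method of the paper: the per-region selection step is left out, and the function names and the toy pixel values are assumptions made for illustration. Pixels are assumed to be linear RGB triples and the image is assumed not to be entirely black.

```python
import math

def normalize(v):
    """Scale a vector to unit length; only the direction of the illuminant matters."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def grey_world(pixels):
    """Grey-world assumption: the average scene color is grey,
    so the mean RGB of the image points towards the illuminant color."""
    n = len(pixels)
    return normalize([sum(p[c] for p in pixels) / n for c in range(3)])

def white_patch(pixels):
    """White-patch assumption: the brightest response in each channel
    comes from a white surface and so reflects the illuminant color."""
    return normalize([max(p[c] for p in pixels) for c in range(3)])

def angular_error_deg(estimate, truth):
    """Angle in degrees between the estimated and the true illuminant."""
    e, t = normalize(estimate), normalize(truth)
    cos = sum(a * b for a, b in zip(e, t))
    cos = max(-1.0, min(1.0, cos))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos))

# Toy image region under a reddish light (made-up values).
region = [(0.8, 0.4, 0.3), (0.6, 0.3, 0.2), (0.4, 0.2, 0.2)]
true_light = (1.0, 0.5, 0.4)
for method in (grey_world, white_patch):
    print(method.__name__, angular_error_deg(method(region), true_light))
```

In a per-region scheme, each geometry segment would be passed to whichever estimator works best for that segment type, and the median of the angular errors over a test set gives the figure reported in evaluations.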

Visual Word Ambiguity

Published June 4th, 2009

The paper of my Ph.D. student Jan van Gemert, Visual Word Ambiguity, co-authored by Cor Veenman, Arnold W.M. Smeulders, and myself, has been accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence. In this paper we study several options for soft assignment in the popular visual vocabulary or codebook approach to visual scene and object categorisation. The codebook model describes an image as a bag of discrete visual words selected from a vocabulary, where the frequency distribution of visual words in an image allows classification. One inherent component of the codebook model is the assignment of discrete visual words to continuous image features. Despite the clear mismatch between this hard assignment and the continuous nature of visual features, the approach has been applied successfully for some years. We have investigated four types of soft assignment of visual words to image features. We demonstrate that explicitly modeling visual word assignment ambiguity improves classification performance compared to the hard assignment of the traditional codebook model. The traditional codebook model is compared against our method on five well-known datasets: 15 natural scenes, Caltech-101, Caltech-256, and Pascal VOC 2007/2008. We demonstrate that large codebook vocabulary sizes completely deteriorate the performance of the traditional model, whereas the proposed model performs consistently. Moreover, we show that our method profits from high-dimensional feature spaces and reaps higher benefits as the number of image categories increases. The paper is available from here.
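
The difference between hard and soft assignment can be sketched in a few lines of Python. The soft variant below uses a Gaussian kernel whose weights are normalised per feature, which is one common way to model assignment ambiguity; it is an illustrative assumption and not necessarily one of the four formulations studied in the paper. The function names, the kernel width sigma, and the toy one-dimensional codebook are hypothetical.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_word(feature, codebook):
    """Index of the visual word closest to the feature."""
    return min(range(len(codebook)), key=lambda j: sq_dist(feature, codebook[j]))

def hard_histogram(features, codebook):
    """Traditional codebook: each feature votes for its single nearest word."""
    hist = [0.0] * len(codebook)
    for f in features:
        hist[nearest_word(f, codebook)] += 1.0
    return [h / len(features) for h in hist]

def soft_histogram(features, codebook, sigma=1.0):
    """Soft assignment: each feature spreads one vote over all words,
    weighted by a Gaussian kernel on the distance to each word."""
    hist = [0.0] * len(codebook)
    for f in features:
        weights = [math.exp(-sq_dist(f, w) / (2.0 * sigma ** 2)) for w in codebook]
        total = sum(weights)
        if total == 0.0:
            # All kernel values underflowed: fall back to the nearest word.
            hist[nearest_word(f, codebook)] += 1.0
            continue
        for j, weight in enumerate(weights):
            hist[j] += weight / total
    return [h / len(features) for h in hist]

# A feature halfway between two words: hard assignment must pick one,
# soft assignment splits the vote.
codebook = [(0.0,), (2.0,)]
features = [(1.0,)]
print(hard_histogram(features, codebook))
print(soft_histogram(features, codebook))
```

The feature at 1.0 is equally far from both words, so the hard histogram places all mass on one word purely by tie-breaking, while the soft histogram records the ambiguity.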