Why do we see woods as woods? What makes up the visual impression of a hill
scene? What exactly do we visually perceive of the world? ...and can this be
perceived by a computer as well? Can you calculate your perception?
My research addresses the understanding of the visual appearance of
scenes and objects, so as to approach these questions from a computational
perspective. Physical, biological, and geological constraints are reflected in
images. By studying intrinsic image properties and image statistics,
I aim to make such constraints explicit. With the recent explosion in
available digital photo collections, exploring and exploiting natural image
statistics has become a real possibility.
Modelling scene appearance at a detailed level of understanding has much
potential to reveal many aspects and mechanisms of visual perception
for machine cognition.
Published June 8th, 2011

I have founded the company PercepTech, which will focus on computer vision research, consultancy, and prototype development. The company is active in the screening of security papers and in the area of high-profile CCTV.
Besides that, the company has extensive experience with inspection vision applications, where reliability and real-time processing are mandatory.
Published August 6th, 2009

Our paper Stages as models for scene geometry, by Vladimir Nedovic, Arnold W.M. Smeulders, Andre Redert (Philips Research),
and myself, has been accepted for publication in
IEEE Transactions on Pattern Analysis and Machine Intelligence.
In this paper we study how to roughly estimate scene geometry from single images.
Reconstruction of 3D scene geometry is an important element for scene understanding.
We show that depth estimation can be approached as a categorisation problem, where the scene is assigned to one of 15 typical 3D
geometry models, called stages, each with a characteristic depth profile.
This categorisation roughly covers the large majority of
geometries occurring in images. We show how changes in camera viewing direction and zooming convert the geometry from one stage into another,
covering the continuum of possible camera motions. Furthermore, we show that, in principle, these stages can be derived from statistical
properties of the image content. Hence, statistical machine learning techniques can be used to determine which stage best approximates the
image content geometry.
Our approach of stage categorisation serves as the first approximation of global depth, narrowing down the search space in precise depth
estimation and object localization.
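
To make the idea of depth estimation as categorisation concrete, here is a minimal Python sketch. It is not the method from the paper: the two-dimensional feature vectors, the labels `stage_a` and `stage_b`, and the nearest-centroid classifier are all assumptions chosen for illustration, standing in for real image statistics, the 15 stages, and whatever statistical learner is actually used.

```python
import math

def fit_centroids(features, labels):
    """Average the feature vectors belonging to each stage label."""
    sums, counts = {}, {}
    for vec, label in zip(features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}

def classify_stage(centroids, vec):
    """Return the stage whose centroid is nearest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))

# Hypothetical training data: each vector stands in for image statistics
# computed from one image; the labels are made-up stage names.
train_features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_labels = ["stage_a", "stage_a", "stage_b", "stage_b"]

centroids = fit_centroids(train_features, train_labels)
predicted = classify_stage(centroids, [0.7, 0.3])
print(predicted)
```

The predicted stage would then act as a coarse depth prior, and a more precise depth estimator could restrict its search to geometries compatible with that stage.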
The paper is available from here.
Published July 15th, 2009

Rui Lu and Arjan Gijsenij's paper Color Constancy Using 3D Stage Geometry, co-authored by Theo Gevers, De Xu, Vladimir Nedovic,
and myself, has been accepted at the
IEEE International Conference on Computer Vision, Kyoto, Japan, Sept 29th - Oct 2nd.
The aim of color constancy is to remove the effect of the color of the light source. As color constancy is inherently an ill-posed problem,
most of the existing color constancy algorithms are based on specific imaging assumptions such as the grey-world and white patch assumptions.
In the paper, we apply 3D geometry models to determine which color constancy method to use for the different geometrical regions found in
images. To this end, images are first classified into stages (rough 3D geometry models); see the work of Vladimir Nedovic in PAMI.
According to the stage models, images are divided
into different regions using hard and soft segmentation. After that, the best color constancy algorithm is selected for each geometry
segment. As a result, light source estimation is tuned to the global scene geometry. Our algorithm opens the possibility to estimate the
remote scene illumination color by distinguishing nearby light sources from distant illuminants. Experiments on large-scale image datasets
show that the proposed algorithm outperforms state-of-the-art single color constancy algorithms, with an improvement of almost 14% in median
angular error. When using an ideal classifier (i.e., all of the test images are correctly classified into stages), the
proposed method achieves an improvement of 31% in median angular error compared to the best-performing single color constancy algorithm.
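
The grey-world and white-patch assumptions, and the angular error used to score an illuminant estimate, can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the function names, the toy pixel values, the region names, and the hypothetical `choice` mapping from region to estimator are assumptions made for illustration only.

```python
import math

def grey_world(pixels):
    """Grey-world assumption: the average scene color is achromatic,
    so the mean RGB value is taken as the illuminant estimate."""
    n = len(pixels)
    return [sum(p[c] for p in pixels) / n for c in range(3)]

def white_patch(pixels):
    """White-patch assumption: the brightest response in each channel
    reflects the illuminant, so the per-channel maximum is used."""
    return [max(p[c] for p in pixels) for c in range(3)]

def angular_error_deg(estimate, truth):
    """Angle in degrees between estimated and true illuminant colors."""
    dot = sum(e * t for e, t in zip(estimate, truth))
    norm = math.hypot(*estimate) * math.hypot(*truth)
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cosine))

# Hypothetical lookup: which estimator to apply to which geometry region.
ESTIMATORS = {"grey_world": grey_world, "white_patch": white_patch}

def estimate_per_region(regions, choice):
    """Apply the chosen estimator to the pixels of each named region."""
    return {name: ESTIMATORS[choice[name]](pixels)
            for name, pixels in regions.items()}

# Toy example with made-up RGB values and region names.
regions = {
    "near": [[2.0, 4.0, 6.0], [4.0, 6.0, 8.0]],
    "far": [[1.0, 1.0, 1.0], [3.0, 2.0, 1.0]],
}
choice = {"near": "grey_world", "far": "white_patch"}
estimates = estimate_per_region(regions, choice)
```

Because the angular error compares only the direction of the two color vectors, an estimate that differs from the true illuminant by a pure intensity scaling scores an error of essentially zero.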
The paper is available from here.
Published June 4th, 2009

The paper of my Ph.D. student Jan van Gemert, Visual Word Ambiguity,
co-authored by Cor Veenman, Arnold W.M. Smeulders,
and myself, has been accepted for publication in
IEEE Transactions on Pattern Analysis and Machine Intelligence.
In this paper we study several options for soft assignment in the popular visual vocabulary or codebook approach to visual scene and
object categorisation. The codebook model describes an
image as a bag of discrete visual words selected from a vocabulary, where the frequency distributions of visual words in an image allow
classification. One inherent component of the codebook model is the assignment of discrete visual words to continuous image features. Despite
the clear mismatch of this hard assignment with the nature of continuous visual features,
the approach has been applied successfully for some years.
We have investigated four types of soft-assignment of visual words to image features. We demonstrate that explicitly modeling visual
word assignment ambiguity improves classification performance compared to the hard-assignment of the traditional codebook model. The
traditional codebook model is compared against our method for five well-known datasets: 15 natural scenes, Caltech-101, Caltech-256, and
Pascal VOC 2007/2008. We demonstrate that large codebook vocabulary sizes severely degrade the performance of the traditional model,
whereas the proposed model performs consistently. Moreover, we show that our method benefits from high-dimensional feature spaces and yields
larger gains as the number of image categories increases.
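
The difference between hard and soft assignment can be shown with a minimal Python sketch. The soft variant below uses Gaussian kernel weights normalised per feature, which is one common form of soft assignment and is not claimed to match any specific variant studied in the paper. The codebook, the feature, and the `sigma` value are made up for illustration.

```python
import math

def hard_histogram(features, codebook):
    """Each feature votes only for its single nearest visual word."""
    hist = [0.0] * len(codebook)
    for f in features:
        nearest = min(range(len(codebook)),
                      key=lambda i: math.dist(codebook[i], f))
        hist[nearest] += 1.0
    return [h / len(features) for h in hist]

def soft_histogram(features, codebook, sigma=1.0):
    """Each feature spreads one unit of vote over all visual words,
    weighted by a Gaussian kernel on the feature-to-word distance."""
    hist = [0.0] * len(codebook)
    for f in features:
        weights = [math.exp(-math.dist(w, f) ** 2 / (2.0 * sigma ** 2))
                   for w in codebook]
        total = sum(weights)
        for i, weight in enumerate(weights):
            hist[i] += weight / total
    return [h / len(features) for h in hist]

# Hypothetical two-word codebook and one feature lying almost halfway
# between the two words, i.e. an ambiguous feature.
codebook = [[0.0, 0.0], [2.0, 0.0]]
features = [[0.9, 0.0]]

hard = hard_histogram(features, codebook)
soft = soft_histogram(features, codebook)
print(hard, soft)
```

With hard assignment the ambiguous feature contributes its whole vote to the first word, while the soft histogram keeps a substantial share for the second word, preserving the information that the feature sits between the two.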
The paper is available from here.