BMVC 2004, Kingston, 7th-9th Sept, 2004

Attending, Foveating and Recognizing Objects in Real World Scenes
M. Bjorkman and J-O. Eklundh (Royal Institute of Technology, Sweden)

Recognition in cluttered real-world scenes is a challenging problem. To find
a particular object of interest within a reasonable time, a wide field of view
is preferable. However, as we will show with practical experiments, robust
recognition is easier if the object is foveated and subtends a considerable part
of the visual field. In this paper, we present a binocular system that
reconciles these two conflicting requirements. The system consists of two
pairs of cameras, a wide-field pair and a foveal pair. A number of object
hypotheses are generated from disparities. An attentional process based on hue and
3D size guides the foveal cameras towards the most salient regions. With the
object foveated and segmented in 3D, recognition is performed using
scale-invariant features. The system is fully automated and runs in real time.
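
To illustrate how an attentional process based on hue and 3D size could rank
object hypotheses, here is a minimal sketch. It is not the authors'
implementation. It assumes a rectified stereo pair, so that depth follows the
standard relation Z = f * B / d, and a pinhole model, so that metric size is
pixel extent times Z / f. The Hypothesis fields, the Gaussian-style score, and
all parameter names and default values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """One object hypothesis from the wide-field pair (hypothetical fields)."""
    disparity_px: float  # mean disparity of the region, in pixels
    width_px: float      # image extent of the region, in pixels
    hue_deg: float       # mean hue of the region, in degrees [0, 360)


def depth_m(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Depth of a point for a rectified stereo pair: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


def size_m(h: Hypothesis, focal_px: float, baseline_m: float) -> float:
    """Approximate metric width: pixel extent scaled by Z / f."""
    return h.width_px * depth_m(h.disparity_px, focal_px, baseline_m) / focal_px


def hue_distance_deg(a: float, b: float) -> float:
    """Shortest angular distance between two hues on the colour circle."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def saliency(h: Hypothesis, target_hue_deg: float, target_size_m: float,
             focal_px: float, baseline_m: float,
             hue_sigma_deg: float = 30.0, size_sigma_m: float = 0.1) -> float:
    """Score in (0, 1]; 1 means hue and 3D size both match the target.

    The Gaussian form and the sigma values are assumptions for illustration.
    """
    dh = hue_distance_deg(h.hue_deg, target_hue_deg) / hue_sigma_deg
    ds = (size_m(h, focal_px, baseline_m) - target_size_m) / size_sigma_m
    return math.exp(-0.5 * (dh * dh + ds * ds))


def most_salient(hyps, target_hue_deg, target_size_m, focal_px, baseline_m):
    """Return the hypothesis the foveal cameras would be directed to first."""
    return max(hyps, key=lambda h: saliency(
        h, target_hue_deg, target_size_m, focal_px, baseline_m))
```

In this sketch the highest-scoring hypothesis would be the first fixation
target; a real system would also need to handle zero or invalid disparities.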
