Humans and robots need to perceive in order to act in the world. Perception is an important problem that sits between the areas of computational vision, artificial intelligence, and cognitive science. Most work in computational vision has focussed on low-level perceptual problems such as scene reconstruction and motion estimation. Most work in AI has treated perception as a solved problem. Finally, research in cognitive science has provided general models of perception, but these have not been sufficiently precise to implement computationally.
My long term research goal is to develop a rigorous computational framework for visual perception that leads to effective tools for tackling a wide range of problems in computational vision. In particular, this research seeks to answer two fundamental questions in perception. First, what computational machinery is necessary to build an artificial (computational) perceptual system. Second, what type of world knowledge is necessary for a perceptual system to ``understand'' a particular visual domain.
One branch of my research focuses on a specific perception problem -- the analysis and recognition of simple spatial motion events (e.g. pickup/putdown, throw/catch, etc.) in video sequences. My approach is based on the premise that common sense physical knowledge provides a critical knowledge base for understanding many simple events. In this research I use simple ``force-dynamic'' descriptions of the scene (e.g. which objects are in contact, which are attached, which can exert forces and torques, etc.) as a basis for an event recognition system. In addition to providing a domain to demonstrate our computational ideas, event recognition is an interesting problem in its own rite. There are a wide range of potential applications of this type of analysis, including human-computer interaction, surveillance and monitoring, robotics, and video indexing.
A second branch of my research is to study the general machinery needed for perception. Perception can be viewed as a process of ``unconscious inference'' in which the perceiver finds interpretation(s) of the world that best explain the image observations. To understand this process I am exploring the connection between perception and Bayesian inference. A Bayesian analysis is appropriate since it provides a mathematically precise representation of uncertainty which allows the perceiver to determine which inferences are justified.