Center for Nonlinear Studies
Monday, March 02, 2009
3:00 PM - 4:00 PM
CNLS Conference Room (TA-3, Bldg 1690)


Untangling Visual Object Recognition in the Brain

Jim DiCarlo
McGovern Institute for Brain Research and Dept. of Brain and Cognitive Sciences, Massachusetts Institute of Technology

Although object recognition is fundamental to our behavior and seemingly effortless, it is a remarkably challenging computational problem because the visual system must somehow tolerate tremendous image variation produced by different views of each object (the “invariance” problem). In this talk, I will present a framework for thinking about this computational crux of object recognition and how it might be solved (“untangling” object manifolds). Our current neurophysiological evidence suggests that the primate brain accomplishes this untangling by gradually transforming its initial neuronal population representation (a photograph on the retina) into a new, explicit form of neuronal population representation at the highest level of the primate ventral visual stream (inferior temporal cortex, IT). We have recently discovered that unsupervised learning of naturally occurring temporal contiguity cues in the visual environment can play a key role in constructing the untangling solution in IT.

The only way to know whether such neuroscience results can explain visual recognition is to incorporate them into instantiated computational models. But the challenges are formidable: 1) neuroscience data do not fully constrain many of the important parameters (“details”) of such models, 2) the primate visual system operates at high dimensionality and with years of natural experience, and 3) the community lacks well-defined methods of assessing the progress of such models. To approach these problems, we and our collaborators are leveraging recent advances in stream processing hardware (high-end GPUs and the PlayStation 3's Cell processor). In analogy to high-throughput screening approaches in molecular biology, we are screening among thousands of network architectures using appropriate recognition benchmarks. We found that this approach gives reproducible gains in recognition performance and can offer insight into which model parameters are most important. As available computational power continues to expand and new neuroscience data are acquired, this approach has the potential to greatly accelerate our understanding of how the visual system accomplishes object recognition.

Host: Vadas Gintautas, T-4/CNLS