Building cortex-like visual representations is a long-standing goal of computational vision. Following the architecture of visual cortex, but emphasizing feedback processes as generators of semantically informed, locally self-consistent image predictions, I will describe a hierarchical sparse coding model that develops cortex-like feature representations. The approach generalizes generative models with sparsity constraints from primary visual cortex (V1) to a hierarchy of deep hidden cortical layers, corresponding to visual areas V2, V4, and IT in the primate ventral pathway. A Bayesian framework is used to address visual inference in the hierarchical structure, where each cortical area acts as an expert for inferring certain aspects of the visual scene, constrained by bottom-up evidence from feed-forward connections and top-down predictions from feedback connections. An optimized continuation method is adopted to search iteratively, and efficiently, for a converged solution.

Applied to natural images, this hierarchical sparse coding model develops internal representations that match neuroscientific findings in primate visual cortex: the V1 layer develops an over-complete set of Gabor-like filters, while higher layers in the ventral pathway contain more complex features than V1. A degree of object-level visual invariance emerges through local pooling of the hierarchical representations. On benchmark object recognition data sets such as Caltech 101, this new systems-level computational model generates hierarchical internal representations that outperform SIFT-based approaches and convolutional networks.

Host: Luis Bettencourt, 667-8453, lmbett@lanl.gov
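To make the core idea concrete, below is a minimal single-layer sketch of the kind of L1-regularized sparse coding the abstract builds on, solved with generic ISTA (iterative shrinkage-thresholding) updates. The dictionary D, the patch x, and all parameters are illustrative placeholders; this is not the speaker's model, hierarchical inference scheme, or continuation method.

```python
import numpy as np

def sparse_code(x, D, lam=0.1, n_iter=200):
    """Infer coefficients a minimizing 0.5*||x - D a||^2 + lam*||a||_1
    via ISTA. A generic one-layer sketch, not the talk's exact method."""
    # Step size from the Lipschitz constant of the gradient: ||D||_2^2.
    L = np.linalg.norm(D, ord=2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        # Gradient step on the reconstruction (bottom-up) term.
        a = a + (D.T @ (x - D @ a)) / L
        # Soft-thresholding enforces the sparsity constraint.
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)
    return a

# Toy usage: an over-complete dictionary (more atoms than input
# dimensions), analogous to the over-complete Gabor-like V1 code
# described in the abstract.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))   # 64-dim patches, 256 atoms
D /= np.linalg.norm(D, axis=0)       # unit-norm dictionary atoms
x = rng.standard_normal(64)          # stand-in for an image patch
a = sparse_code(x, D)
print("active coefficients:", np.count_nonzero(a))
```

In a hierarchical version of this idea, each layer's coefficients would additionally be constrained by a top-down prediction from the layer above, which is where the Bayesian coupling of feed-forward and feedback connections described in the abstract would enter.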