Center for Nonlinear Studies
Monday, November 09, 2009
10:00 AM - 11:00 AM
CNLS Conference Room (TA-3, Bldg 1690)


High Efficiency Model Identification for Statistical Graphical Models

Terran Lane
University of New Mexico

Statistical graphical models, such as Bayesian networks, Markov random fields, and factor graphs, have become increasingly important for data modeling in fields as diverse as economics, ecology, physics, neuroscience, computer vision, and genomics. These models are attractive because they efficiently and compactly represent the often complex probability distributions that arise in such fields, and because they provide semantically rich models to domain scientists. However, the graph structure of the target model is often initially unknown -- indeed, in many cases the model structure is itself the quantity of interest to the domain scientist. The problem of structure identification (or model selection, if you prefer) remains a prominent open question in this field. Statistically, the problem is one of identifying (conditional, multivariate) dependencies from data, and is reasonably well understood. Computationally, however, the task is quite challenging: worst-case analysis shows that optimal structure identification is NP-hard, while practical algorithms typically have high-order polynomial runtimes and may require many scans over the complete data to accumulate sufficient statistics.

In this talk, we present preliminary work on a new approach to structure identification that exploits the topology of the graph structure space itself. The key insight is that we need not compute the exact optimality criterion for every model we evaluate during search, provided we can approximate it well -- and building good function approximators is precisely what machine learning and statistics excel at. We demonstrate how to use this insight to build a structure search algorithm that runs orders of magnitude faster than conventional approaches, while achieving results that are at least as good, if not better.
We give preliminary data demonstrating our approach on a number of synthetic and real-world data sets, including some challenging neuroimaging data sets that involve hidden variables.
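The core idea described above -- screening candidate search moves with a cheap approximation and computing the exact score only for the most promising ones -- can be sketched roughly as follows. This is an illustrative reconstruction, not the speaker's algorithm: the BIC-style linear-Gaussian local score, the absolute-correlation surrogate, and all function names here are assumptions for the sake of the example.

```python
# Hypothetical sketch: greedy Bayesian-network structure search where a cheap
# surrogate (absolute correlation) screens candidate edges, and only the
# top_k survivors get the expensive exact score. All modeling choices here
# are assumptions, not the speaker's method.
import numpy as np

def bic_local(data, child, parents):
    """BIC-style local score for a linear-Gaussian node given its parents."""
    n = data.shape[0]
    y = data[:, child]
    X = (np.column_stack([data[:, sorted(parents)], np.ones(n)])
         if parents else np.ones((n, 1)))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = max(float(resid @ resid) / n, 1e-12)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
    return loglik - 0.5 * X.shape[1] * np.log(n)  # penalize parameters

def _creates_cycle(parents, i, j):
    """Adding edge i -> j creates a cycle iff j is an ancestor of i."""
    stack, seen = [i], set()
    while stack:
        u = stack.pop()
        if u == j:
            return True
        seen.add(u)
        stack.extend(p for p in parents[u] if p not in seen)
    return False

def greedy_search(data, surrogate=None, top_k=5):
    """Greedy edge addition. With surrogate="corr", rank candidate edges by
    a cheap proxy and evaluate the exact BIC gain for only the top_k."""
    d = data.shape[1]
    parents = {j: set() for j in range(d)}
    score = {j: bic_local(data, j, parents[j]) for j in range(d)}
    corr = np.abs(np.corrcoef(data, rowvar=False))
    improved = True
    while improved:
        improved = False
        cands = [(i, j) for j in range(d) for i in range(d)
                 if i != j and i not in parents[j]
                 and not _creates_cycle(parents, i, j)]
        if surrogate == "corr":
            cands.sort(key=lambda e: -corr[e[0], e[1]])
            cands = cands[:top_k]          # exact scoring only for survivors
        best, gain = None, 1e-9
        for i, j in cands:
            g = bic_local(data, j, parents[j] | {i}) - score[j]
            if g > gain:
                best, gain = (i, j), g
        if best is not None:
            i, j = best
            parents[j].add(i)
            score[j] += gain
            improved = True
    return parents
```

In this toy version the surrogate is a fixed statistic; the talk's proposal goes further by *learning* the approximator over structure space, but the payoff is the same: far fewer exact score evaluations per search step.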

Host: Joshua Neil