Center for Nonlinear Studies
 
Wednesday, April 14, 2010
3:00 PM - 4:30 PM
CNLS Conference Room (TA-3, Bldg 1690)

Seminar

Epistemology of Small-Sample Classification

Edward R. Dougherty
Department of Electrical and Computer Engineering, Texas A&M

The accumulation of high-throughput genomic data has spawned a host of proposed gene-expression classifiers to discriminate between phenotypes, in particular, different types, stages, and prognoses for disease. Classical pattern recognition typically involved features possessing contextual meaning, such as geometric features in machine vision and character recognition, and samples that were large in comparison to the number of features. On the other hand, genomic features have generally not depended on biological understanding and the number of features has been extraordinarily large in comparison to sample size. This situation obviously calls for the development of the relevant small-sample theory; however, there has been little effort to understand and address the epistemological issues created by the reversal of the classical paradigm. The consequence is a large number of published papers demonstrably lacking scientific validity and no rigorous scientific road ahead to realize the potential of molecular-based diagnosis and prognosis. This talk discusses the issue of validity in classification, reviews the extensive epistemological failings over the last decade, and proposes an epistemologically sound path ahead based on extending the methods of classical mathematical statistics into the current high-throughput environment.

Host: Garrett Kenyon, gkenyon@lanl.gov