Center for Nonlinear Studies
 
Wednesday, April 14, 2010
3:00 PM - 4:30 PM
CNLS Conference Room (TA-3, Bldg 1690)

Seminar

Epistemology of Small-Sample Classification

Edward R. Dougherty
Department of Electrical and Computer Engineering, Texas A&M

The accumulation of high-throughput genomic data has spawned a host of proposed gene-expression classifiers to discriminate between phenotypes, in particular, different types, stages, and prognoses for disease. Classical pattern recognition typically involved features possessing contextual meaning, such as geometric features in machine vision and character recognition, and samples that were large in comparison to the number of features. On the other hand, genomic features have generally not depended on biological understanding and the number of features has been extraordinarily large in comparison to sample size. This situation obviously calls for the development of the relevant small-sample theory; however, there has been little effort to understand and address the epistemological issues created by the reversal of the classical paradigm. The consequence is a large number of published papers demonstrably lacking scientific validity and no rigorous scientific road ahead to realize the potential of molecular-based diagnosis and prognosis. This talk discusses the issue of validity in classification, reviews the extensive epistemological failings over the last decade, and proposes an epistemologically sound path ahead based on extending the methods of classical mathematical statistics into the current high-throughput environment.

Host: Garrett Kenyon, gkenyon@lanl.gov