Center for Nonlinear Studies
 
Wednesday, April 14, 2010
3:00 PM - 4:30 PM
CNLS Conference Room (TA-3, Bldg 1690)

Seminar

Epistemology of Small-Sample Classification

Edward R. Dougherty
Department of Electrical and Computer Engineering, Texas A&M

The accumulation of high-throughput genomic data has spawned a host of proposed gene-expression classifiers to discriminate between phenotypes, in particular, different types, stages, and prognoses for disease. Classical pattern recognition typically involved features possessing contextual meaning, such as geometric features in machine vision and character recognition, and samples that were large in comparison to the number of features. On the other hand, genomic features have generally not depended on biological understanding and the number of features has been extraordinarily large in comparison to sample size. This situation obviously calls for the development of the relevant small-sample theory; however, there has been little effort to understand and address the epistemological issues created by the reversal of the classical paradigm. The consequence is a large number of published papers demonstrably lacking scientific validity and no rigorous scientific road ahead to realize the potential of molecular-based diagnosis and prognosis. This talk discusses the issue of validity in classification, reviews the extensive epistemological failings over the last decade, and proposes an epistemologically sound path ahead based on extending the methods of classical mathematical statistics into the current high-throughput environment.

Host: Garrett Kenyon, gkenyon@lanl.gov