Center for Nonlinear Studies

Tuesday, May 15, 2018
10:30 AM - 12:00 PM
CNLS Conference Room (TA-3, Bldg 1690)
Smart Grid
Importance sampling for the power grid and correlated false discoveries in genomics
Art Owen
Stanford University
Here we estimate the probability $\mu$ of a union of $J$ rare events defined in terms of a random variable $x$. The algorithm begins by picking event $j$ with probability proportional to its individual occurrence probability. We then sample $x$ conditionally on that event happening, and count the total number $S(x)$ of events that happen. The estimate of $\mu$ is then the union bound times the average of $S(x)^{-1}$ over $n$ repeated trials. This importance sampler has been used by Frigessi & Vercellis (1985) for combinatorial enumeration, Naiman & Priebe (2001) for scan statistics in genomics and medical imaging, Shi, Siegmund & Yakir (2007) for linkage analysis, and Adler, Blanchet & Liu (2012) for exceedance probabilities of Gaussian random fields. The literature does not name it. It always has At Least One rare Event, so we call it ALOE.

We find upper bounds on the variance of the ALOE importance sampler. It always has $\mathrm{var}(\hat\mu) \le \mu(\bar\mu - \mu)/n$, where $\bar\mu$ denotes the union bound. It also has $\mathrm{var}(\hat\mu) \le \mu^2 (J + J^{-1} - 2)/(4n)$. We consider power system reliability, where the phase differences between connected nodes have a joint Gaussian distribution and the $J$ rare events arise from unacceptably large phase differences. In the grid reliability problems, even some events defined by 5772 constraints in 326 dimensions, with probability below $10^{-22}$, are estimated with a coefficient of variation of about 0.0024 with only $n = 10{,}000$ sample values.

The algorithm extends beyond estimation of $\mu$. We also use this sampler in a genomics setting. False discoveries in genomics are usually modeled as independent events. For instance, the Benjamini-Hochberg procedure is defined that way. Unfortunately, the genomics setting has highly correlated test statistics, causing false discoveries to come in bursts. We use ALOE to estimate the distribution of the number of false discoveries for a Gaussian phenotype under a null model.
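The sampling steps described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the speaker's implementation. It assumes that x is a standard Gaussian vector and that each rare event is a half-space of the form w[j]·x >= tau[j]; the names `aloe_estimate`, `upper_tail`, `w`, and `tau` are hypothetical and chosen for this example. It also assumes every individual event probability is strictly between 0 and 1 in floating point.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used only for its inverse CDF


def upper_tail(t):
    """P(Z >= t) for a standard normal Z, computed stably via erfc."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def aloe_estimate(w, tau, n, rng):
    """Sketch of ALOE for P(union_j {w[j].x >= tau[j]}), x ~ N(0, I).

    Assumption: half-space events under a standard Gaussian, so each
    individual probability and each conditional draw is available in
    closed form.
    """
    J, d = len(w), len(w[0])
    norms = [math.sqrt(dot(row, row)) for row in w]
    units = [[c / nrm for c in row] for row, nrm in zip(w, norms)]
    t = [tau_j / nrm for tau_j, nrm in zip(tau, norms)]
    p = [upper_tail(tj) for tj in t]   # individual event probabilities
    union_bound = sum(p)

    total = 0.0
    for _ in range(n):
        # Step 1: pick event j with probability proportional to p[j].
        j = rng.choices(range(J), weights=p)[0]
        # Step 2: sample x conditionally on event j. The component of x
        # along units[j] is a normal truncated to [t[j], inf), drawn by
        # inverting the upper tail; the rest of x is left unchanged.
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        u01 = 1.0 - rng.random()              # uniform on (0, 1]
        y = -_STD.inv_cdf(u01 * p[j])         # satisfies y >= t[j]
        shift = y - dot(units[j], z)
        x = [zi + shift * ui for zi, ui in zip(z, units[j])]
        # Step 3: count S(x). Event j holds by construction, so it is
        # counted directly to avoid a rounding miss on its boundary.
        s = 1 + sum(1 for k in range(J)
                    if k != j and dot(units[k], x) >= t[k])
        total += 1.0 / s
    # Estimate: union bound times the average of 1 / S(x).
    return union_bound * total / n


if __name__ == "__main__":
    rng = random.Random(0)
    print(aloe_estimate([[1.0, 0.0], [0.0, 1.0]], [3.0, 3.0], 10_000, rng))
```

Because every draw contains at least one event, each term 1/S(x) lies between 1/J and 1, so the estimate can never exceed the union bound or fall below the union bound divided by J.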

Host: Michael Chertkov