Center for Nonlinear Studies

Wednesday, August 16, 2017
3:00 PM - 4:00 PM
CNLS Conference Room (TA-3, Bldg 1690)
Seminar
Large Scale Distribution-based Data Analysis and Visualization
Han-Wei Shen
The Ohio State University
Scientists gain an overview of simulation results and identify regions of interest by transforming data into compact information descriptors that characterize those results and allow detailed analysis on demand. Among the many existing feature descriptors, statistical information derived from data samples is a promising approach to taming the big data avalanche, because data distributions computed from a population can compactly describe the presence and characteristics of salient data features with minimal data movement. The ability to computationally summarize and process data using distributions also provides an efficient and representative capture of information that can adjust to size and resource constraints, with the added benefit that the uncertainty associated with the results can be quantified and communicated. In this talk, I will discuss our recent work on using distributions as a new paradigm for representing large-scale scientific data sets. Our goals are to ensure that scientists can easily obtain an overview of the entire data set regardless of the size of the simulation output; understand the characteristics and locations of features; easily interact with the data and select regions and features of interest; and perform all of these analysis tasks with a small memory footprint.

Host: Curt Canada, 505-665-7453, cvc@lanl.gov