Because many large scientific algorithms cannot use a memory cache efficiently, large-scale engineering and science calculations have seen little benefit from hardware performance improvements over the last decade. However, with the advent of programmable graphics processors four years ago, and of a C/C++-style programming model for GPUs (CUDA) roughly a year ago, it is now possible to make up for that deficit and obtain, on average, ten to twenty times the calculation throughput of the CPU when solving large scientific problems.
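As a rough illustration of the data-parallel style that CUDA exposes, the sketch below launches one GPU thread per array element to perform a SAXPY update; the kernel name, problem size, and launch configuration are illustrative assumptions rather than details taken from the talk.

#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Each thread updates one element: y[i] = a*x[i] + y[i].
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        y[i] = a * x[i] + y[i];
}

int main()
{
    const int n = 1 << 20;                  // one million elements (illustrative)
    const size_t bytes = n * sizeof(float);

    // Host arrays
    float *hx = (float *)malloc(bytes);
    float *hy = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { hx[i] = 1.0f; hy[i] = 2.0f; }

    // Device arrays
    float *dx, *dy;
    cudaMalloc(&dx, bytes);
    cudaMalloc(&dy, bytes);
    cudaMemcpy(dx, hx, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dy, hy, bytes, cudaMemcpyHostToDevice);

    // One thread per element, 256 threads per block
    int threads = 256;
    int blocks  = (n + threads - 1) / threads;
    saxpy<<<blocks, threads>>>(n, 2.0f, dx, dy);

    cudaMemcpy(hy, dy, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", hy[0]);           // expect 4.0

    cudaFree(dx); cudaFree(dy);
    free(hx); free(hy);
    return 0;
}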
The peculiarities of graphics processor (GPU) hardware, and how they affect the structure and performance of scientific algorithms, are discussed.
Examples are presented from a range of application domains including:
partial differential equation solution, large sequence matching
(bio-informatics), and graph traversal and manipulation.
The challenges and possibilities of using many GPUs in an
MPI-cluster environment are also presented along with the
performance of a GPU-based desktop supercomputer with 1920
processing cores in a single PC.
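One common pattern for using many GPUs in an MPI-cluster environment is to bind each MPI rank to a single GPU on its node and exchange boundary data between ranks with ordinary MPI calls. The sketch below shows only that device-assignment step; the modulo mapping and rank placement it assumes are illustrative, not a description of the cluster used in the talk.

#include <mpi.h>
#include <cuda_runtime.h>
#include <cstdio>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, nranks;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    // Assume ranks are placed consecutively on each node, so
    // (rank % GPUs-per-node) selects a distinct device for each rank.
    int ngpus = 0;
    cudaGetDeviceCount(&ngpus);
    cudaSetDevice(rank % ngpus);

    printf("rank %d of %d using GPU %d of %d on this node\n",
           rank, nranks, rank % ngpus, ngpus);

    // ... each rank launches kernels on its own GPU and exchanges
    // halo/boundary data with neighboring ranks via MPI ...

    MPI_Finalize();
    return 0;
}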