Center for Nonlinear Studies
Wednesday, October 12, 2016
2:00 PM - 3:00 PM
CNLS Conference Room (TA-3, Bldg 1690)

Seminar

Performance Monitoring and Dynamic Adaptation in HPX - A Task-based Runtime System

Patricia Grubel
New Mexico State University

As parallel computation enters the exascale era, where applications may run on millions to billions of processors concurrently, all aspects of the computational model must be transformed to meet the challenges of scaling-impaired applications. One class of models aimed at exascale computation is the task-based parallel computational model. Task-based execution models and their implementations support parallelism through massive multi-threading: an application is split into numerous tasks of varying size that execute concurrently. In task-based systems, scheduling tasks onto resources can incur large overheads that vary with the underlying hardware. The goal of this work is to dynamically control task grain size to minimize these overheads. Performance studies are used to determine the overheads and the metrics suitable for dynamically tuning task granularity for improved performance. The performance studies and the ensuing dynamic-adaptation example use HPX, the first implementation of the ParalleX execution model. HPX is a general-purpose C++ runtime system that employs asynchronous fine-grained tasks and asynchronous communication for parallel and distributed applications. The performance studies provide an understanding of task-scheduling overheads and an indication of appropriate task durations when scaling on both multi-core (Intel Ivy Bridge) and many-core (Intel Xeon Phi) systems. This knowledge can be applied to understanding efficiency losses in parallel and distributed applications caused both by task scheduling and by overheads that parallelization incurs on the underlying hardware.

Host: Louis Vernon