The sparse matrix-vector multiplication (SpMV) kernel is a key performance component of numerous algorithms in computational science. Despite the kernel's apparent simplicity, the sparse and potentially irregular data access patterns of SpMV and its intrinsically low computational intensity have challenged the development of high-performance implementations for decades. Still, these developments are rarely guided by appropriate performance models. This talk will report on recent advances in boosting the performance of SpMV with symmetric matrices and in cache blocking for the computation of matrix power kernels (MPK) with sparse matrices. Reformulating SpMV as a graph traversal problem, as done by RACE [1], allows us to handle dependencies in parallelization and cache blocking in a hardware-efficient way. At the compute-node level, the RACE implementation of sparse MPK achieves speed-ups of up to 2x-5x compared to state-of-the-art implementations [2]. Various numerical schemes such as s-step Krylov solvers, polynomial preconditioners, and power clustering algorithms may directly benefit from these developments.

[1] Alappat, C. L. et al.: A Recursive Algebraic Coloring Technique for Hardware-efficient Symmetric Sparse Matrix-vector Multiplication. In: ACM TOPC 7 (2020), Article No. 19. ISSN: 2329-4949. DOI: 10.1145/3399732
[2] Alappat, C. L. et al.: Level-Based Blocking for Sparse Matrices: Sparse Matrix-Power-Vector Multiplication. In: IEEE Transactions on Parallel and Distributed Systems 34 (2023), No. 2, pp. 581-597. DOI: 10.1109/TPDS.2022.3223512

Bio: Gerhard Wellein is a Professor for High Performance Computing at the Department of Computer Science of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and holds a PhD in theoretical physics from the University of Bayreuth. Since 2021 he has been the director of the Erlangen National Center for High Performance Computing (NHR@FAU).
Gerhard Wellein has more than twenty years of experience in teaching HPC techniques to students and scientists from computational science and engineering. His research interests focus on performance modelling and performance engineering, architecture-specific code optimization, novel parallelization approaches, and hardware-efficient building blocks for sparse linear algebra and stencil solvers. He has conducted and led numerous HPC projects.

Prospective Joint CNLS & IC-APT Colloquium speakers: please contact Anna Matsekh (matsekh@lanl.gov) to propose or nominate a talk. We aim to bring presentations featuring both fundamental and applied research conducted using Institutional Computing and HPC resources at LANL, across the DOE complex, and externally. We also welcome informal HPC-focused training sessions.

Past Joint CNLS & IC-APT Colloquia presentations: https://ic-wiki.lanl.gov/Home

Teams: Join the meeting now
Meeting ID: 295 410 931 869 6
Passcode: 8SZ7zn3c

Host: Avadh Saxena (T-4)