TY - GEN
T1 - Enabling Call Path Querying in Hatchet to Identify Performance Bottlenecks in Scientific Applications
AU - Lumsden, Ian
AU - Luettgau, Jakob
AU - Lama, Vanessa
AU - Scully-Allison, Connor
AU - Brink, Stephanie
AU - Isaacs, Katherine E.
AU - Pearce, Olga
AU - Taufer, Michela
N1 - Funding Information:
This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory (LLNL) under Contract DE-AC52-07NA27344 (LLNL-CONF-835691). The UTK group acknowledges the support of NSF (#1841758, #1900888, and #2138811). The authors thank Dr. Abhinav Bhatele (University of Maryland), Dr. Todd Gamblin (LLNL), and Sanjukta Bhowmick (University of North Texas) for their feedback during the preparation of this work.
Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
AB - As computational science applications benefit from larger-scale, more heterogeneous high performance computing (HPC) systems, the process of studying their performance becomes increasingly complex. The performance data analysis library Hatchet provides some insights into this complexity, but is currently limited in its analysis capabilities. Missing capabilities include the handling of relational caller-callee data captured by HPC profilers. To address this shortcoming, we augment Hatchet with a Call Path Query Language that leverages relational data in the performance analysis of scientific applications. Specifically, our Query Language enables data reduction using call path pattern matching. We demonstrate the effectiveness of our Query Language in identifying performance bottlenecks and enhancing Hatchet's analysis capabilities through three case studies. In the first case study, we compare the performance of sequential and multi-threaded versions of the graph alignment application Fido. In doing so, we identify the existence of large memory inefficiencies in both versions. In the second case study, we examine the performance of MPI calls in the linear algebra mini-application AMG2013 when using MVAPICH and Spectrum-MPI. In doing so, we identify hidden performance losses in specific MPI functions. In the third case study, we illustrate the use of our Query Language in Hatchet's interactive visualization. In doing so, we show that our Query Language enables a simple and intuitive way to massively reduce profiling data.
KW - High Performance Computing
KW - Message Passing
KW - Performance Analysis
KW - Query Language
KW - Scientific Applications
UR - http://www.scopus.com/inward/record.url?scp=85145436614&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85145436614&partnerID=8YFLogxK
U2 - 10.1109/eScience55777.2022.00039
DO - 10.1109/eScience55777.2022.00039
M3 - Conference contribution
AN - SCOPUS:85145436614
T3 - Proceedings - 2022 IEEE 18th International Conference on e-Science, eScience 2022
SP - 256
EP - 266
BT - Proceedings - 2022 IEEE 18th International Conference on e-Science, eScience 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 18th IEEE International Conference on e-Science, eScience 2022
Y2 - 10 October 2022 through 14 October 2022
ER -