TY - JOUR
T1 - Traveler: Navigating Task Parallel Traces for Performance Analysis
AU - Sakin, Sayef Azad
AU - Bigelow, Alex
AU - Tohid, R.
AU - Scully-Allison, Connor
AU - Scheidegger, Carlos
AU - Brandt, Steven R.
AU - Taylor, Christopher
AU - Huck, Kevin A.
AU - Kaiser, Hartmut
AU - Isaacs, Katherine E.
N1 - Funding Information:
This work was supported by the United States Department of Defense through DTIC Contract FA8075-14-D-002-007, the Department of Energy under DE-SC0022044, and the National Science Foundation under NSF IIS-1844573.
Publisher Copyright:
© 1995-2012 IEEE.
PY - 2023/1/1
Y1 - 2023/1/1
N2 - Understanding the behavior of software in execution is a key step in identifying and fixing performance issues. This is especially important in high performance computing contexts where even minor performance tweaks can translate into large savings in terms of computational resource use. To aid performance analysis, developers may collect an execution trace - a chronological log of program activity during execution. As traces represent the full history, developers can discover a wide array of possibly previously unknown performance issues, making them an important artifact for exploratory performance analysis. However, interactive trace visualization is difficult due to issues of data size and complexity of meaning. Traces represent nanosecond-level events across many parallel processes, meaning the collected data is often large and difficult to explore. The rise of asynchronous task parallel programming paradigms complicates the relation between events and their probable cause. To address these challenges, we conduct a continuing design study in collaboration with high performance computing researchers. We develop diverse and hierarchical ways to navigate and represent execution trace data in support of their trace analysis tasks. Through an iterative design process, we developed Traveler, an integrated visualization platform for task parallel traces. Traveler provides multiple linked interfaces to help navigate trace data from multiple contexts. We evaluate the utility of Traveler through feedback from users and a case study, finding that integrating multiple modes of navigation in our design supported performance analysis tasks and led to the discovery of previously unknown behavior in a distributed array library.
KW - event sequence visualization
KW - parallel computing
KW - performance analysis
KW - software visualization
KW - traces
UR - http://www.scopus.com/inward/record.url?scp=85139479409&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85139479409&partnerID=8YFLogxK
U2 - 10.1109/TVCG.2022.3209375
DO - 10.1109/TVCG.2022.3209375
M3 - Article
C2 - 36166559
AN - SCOPUS:85139479409
SN - 1077-2626
VL - 29
SP - 788
EP - 797
JO - IEEE Transactions on Visualization and Computer Graphics
JF - IEEE Transactions on Visualization and Computer Graphics
IS - 1
ER -