TY - JOUR
T1 - Ordering Traces Logically to Identify Lateness in Message Passing Programs
AU - Isaacs, Katherine E.
AU - Gamblin, Todd
AU - Bhatele, Abhinav
AU - Schulz, Martin
AU - Hamann, Bernd
AU - Bremer, Peer-Timo
N1 - Publisher Copyright: © 1990-2012 IEEE.
PY - 2016/3/1
Y1 - 2016/3/1
AB - Event traces are valuable for understanding the behavior of parallel programs. However, automatically analyzing a large parallel trace is difficult, especially without a specific objective. We aid this endeavor by extracting a trace's logical structure, an ordering of trace events derived from happened-before relationships, while taking into account developer intent. Using this structure, we can calculate an operation's delay relative to its peers on other processes. The logical structure also serves as a platform for comparing and clustering processes as well as highlighting communication patterns in a trace visualization. We present an algorithm for determining this idealized logical structure from traces of message passing programs, and we develop metrics to quantify delays and differences among processes. We implement our techniques in Ravel, a parallel trace visualization tool that displays both logical and physical timelines. Rather than showing the duration of each operation, we display where delays begin and end, and how they propagate. We apply our approach to the traces of several message passing applications, demonstrating the accuracy of our extracted structure and its utility in analyzing these codes.
KW - Trace analysis
KW - performance
UR - http://www.scopus.com/inward/record.url?scp=84962487379&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84962487379&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2015.2417531
DO - 10.1109/TPDS.2015.2417531
M3 - Article
AN - SCOPUS:84962487379
SN - 1045-9219
VL - 27
SP - 829
EP - 840
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 3
M1 - 7072533
ER -