TY - GEN
T1 - Identifying and scheduling loop chains using directives
AU - Bertolacci, Ian J.
AU - Strout, Michelle Mills
AU - Guzik, Stephen
AU - Riley, Jordan
AU - Olschanowsky, Catherine
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2017/1/30
Y1 - 2017/1/30
AB - Exposing opportunities for parallelization while explicitly managing data locality is the primary challenge to porting and optimizing existing computational science simulation codes to improve performance and accuracy. OpenMP provides many mechanisms for expressing parallelism, but it remains primarily the programmer's responsibility to group computations to improve data locality. The loop chain abstraction, where data access patterns are included with the specification of parallel loops, provides compilers with sufficient information to automate the parallelism versus data locality tradeoff. In this paper, we present a loop chain pragma and an extension to the omp for directive that enable the specification of loop chains and of high-level schedules on loop chains. We show example usage of the extensions, describe their implementation, and show preliminary performance results for some simple examples.
UR - http://www.scopus.com/inward/record.url?scp=85015177221&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85015177221&partnerID=8YFLogxK
DO - 10.1109/WACCPD.2016.010
M3 - Conference contribution
AN - SCOPUS:85015177221
T3 - Proceedings of WACCPD 2016: 3rd Workshop on Accelerator Programming using Directives - Held in conjunction with SC 2016: The International Conference for High Performance Computing, Networking, Storage and Analysis
SP - 57
EP - 67
BT - Proceedings of WACCPD 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd Workshop on Accelerator Programming using Directives, WACCPD 2016
Y2 - 14 November 2016
ER -