Inter-loop optimizations in RAJA using loop chains

Brandon Neth, Thomas R.W. Scogland, Bronis R. de Supinski, Michelle Mills Strout

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Typical parallelization approaches such as OpenMP and CUDA provide constructs for parallelizing individual loops and blocking them for data locality. By focusing on each loop separately, these approaches miss the data locality that inter-loop data reuse makes possible. The loop chain abstraction provides a framework for reasoning about and applying inter-loop optimizations. In this work, we incorporate the loop chain abstraction into RAJA, a performance portability layer for high-performance computing applications. Using the loop-chain-extended RAJA, or RAJALC, developers can have the RAJA library apply loop transformations such as loop fusion and overlapped tiling while maintaining the original structure of their programs. By introducing targeted symbolic evaluation capabilities, we can collect and cache the data access information required to verify loop transformations. We evaluate the performance improvement and refactoring costs of our extension. Overall, our results show that RAJALC achieves 85-98% of the performance improvement of hand-optimized kernels with dramatically fewer code changes.
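As a rough illustration of the inter-loop data reuse that loop fusion exploits, the following minimal sketch uses plain C++ rather than the RAJA or RAJALC API. The function names `unfused` and `fused` and the arithmetic in the loop bodies are assumptions chosen for illustration; they do not come from the paper. Fusion is legal here because iteration `i` of the second loop reads only the value produced by iteration `i` of the first.

```cpp
#include <cstddef>
#include <vector>

// Unfused form: two separate sweeps over the data.
// The intermediate array tmp is written in full by the first loop and
// read back by the second, so for large n it may leave the cache in between.
std::vector<double> unfused(const std::vector<double>& a) {
    const std::size_t n = a.size();
    std::vector<double> tmp(n), out(n);
    for (std::size_t i = 0; i < n; ++i) {
        tmp[i] = 2.0 * a[i];      // producer loop
    }
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = tmp[i] + 1.0;    // consumer loop
    }
    return out;
}

// Fused form: one sweep. The produced value is consumed immediately,
// while it is still in a register, and the intermediate array disappears.
std::vector<double> fused(const std::vector<double>& a) {
    const std::size_t n = a.size();
    std::vector<double> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const double t = 2.0 * a[i];  // body of the producer loop
        out[i] = t + 1.0;             // body of the consumer loop
    }
    return out;
}
```

Both functions compute the same result. The hand-written fused form is the kind of restructuring that, according to the abstract, RAJALC aims to apply on the developer's behalf while the source keeps its original loop structure.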

Original language: English (US)
Title of host publication: ICS 2021 - Proceedings of the 2021 ACM International Conference on Supercomputing
Publisher: Association for Computing Machinery
Pages: 1-12
Number of pages: 12
ISBN (Electronic): 9781450383356
State: Published - Jun 3 2021
Externally published: Yes
Event: 35th ACM International Conference on Supercomputing, ICS 2021 - Virtual, Online, United States
Duration: Jun 14 2021 to Jun 17 2021

Publication series

Name: Proceedings of the International Conference on Supercomputing

Conference

Conference: 35th ACM International Conference on Supercomputing, ICS 2021
Country/Territory: United States
City: Virtual, Online
Period: 6/14/21 to 6/17/21

Keywords

  • C++
  • Data locality
  • Loop chains
  • Performance portability
  • Polyhedral analysis
  • RAJA
  • Symbolic execution

ASJC Scopus subject areas

  • Computer Science(all)
