Generalizing run-time tiling with the loop chain abstraction

Michelle Mills Strout, Fabio Luporini, Christopher D. Krieger, Carlo Bertolli, Gheorghe Teodor Bercea, Catherine Olschanowsky, J. Ramanujam, Paul H.J. Kelly

Research output: Chapter in Book/Report/Conference proceedingConference contribution

12 Scopus citations

Abstract

Many scientific applications are organized in a data parallel way: as sequences of parallel and/or reduction loops. This exposes parallelism well, but does not convert data reuse between loops into data locality. This paper focuses on this issue in parallel loops whose loop-to-loop dependence structure is data-dependent due to indirect references such as A[B[i]]. Such references are a common occurrence in sparse matrix computations, molecular dynamics simulations, and unstructured-mesh computational fluid dynamics (CFD). Previously, sparse tiling approaches were developed for individual benchmarks to group iterations across such loops to improve data locality. These approaches were shown to benefit applications such as moldyn, Gauss-Seidel, and the sparse matrix powers kernel, however the run-time routines for performing sparse tiling were hand coded per application. In this paper, we present a generalized full sparse tiling algorithm that uses the newly developed loop chain abstraction as input, improves inter-loop data locality, and creates a task graph to expose shared-memory parallelism at runtime. We evaluate the overhead and performance impact of the generalized full sparse tiling algorithm on two codes: a sparse Jacobi iterative solver and the Airfoil CFD benchmark.

Original languageEnglish (US)
Title of host publicationProceedings - IEEE 28th International Parallel and Distributed Processing Symposium, IPDPS 2014
PublisherIEEE Computer Society
Pages1136-1145
Number of pages10
ISBN (Print)9780769552071
DOIs
StatePublished - 2014
Externally publishedYes
Event28th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2014 - Phoenix, AZ, United States
Duration: May 19 2014May 23 2014

Publication series

NameProceedings of the International Parallel and Distributed Processing Symposium, IPDPS
ISSN (Print)1530-2075
ISSN (Electronic)2332-1237

Other

Other28th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2014
Country/TerritoryUnited States
CityPhoenix, AZ
Period5/19/145/23/14

Keywords

  • inspector/executor
  • run-time reordering transformations
  • tiling

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Networks and Communications
  • Hardware and Architecture
  • Software

Fingerprint

Dive into the research topics of 'Generalizing run-time tiling with the loop chain abstraction'. Together they form a unique fingerprint.

Cite this