A Study on Balancing Parallelism, Data Locality, and Recomputation in Existing PDE Solvers

Catherine Olschanowsky, Michelle Mills Strout, Stephen Guzik, John Loffeld, Jeffrey Hittinger

Research output: Contribution to journalConference articlepeer-review

29 Scopus citations

Abstract

Structured-grid PDE solver frameworks parallelize over boxes, which are rectangular domains of cells or faces in a structured grid. In the Chombo framework, the box sizes are typically 163 or 323, but larger box sizes such as 1283 would result in less surface area and therefore less storage, copying, and/or ghost cells communication overhead. Unfortunately, current on node parallelization schemes perform poorly for these larger box sizes. In this paper, we investigate 30 different inter-loop optimization strategies and demonstrate the parallel scaling advantages of some of these variants on NUMA multicore nodes. Shifted, fused, and communication-avoiding variants for 1283 boxes result in close to ideal parallel scaling and come close to matching the performance of 163 boxes on three different multicore systems for a benchmark that is a proxy for program idioms found in Computational Fluid Dynamic (CFD) codes.

Original languageEnglish (US)
Article number7013052
Pages (from-to)793-804
Number of pages12
JournalInternational Conference for High Performance Computing, Networking, Storage and Analysis, SC
Volume2015-January
Issue numberJanuary
DOIs
StatePublished - Jan 16 2014
Externally publishedYes
EventInternational Conference for High Performance Computing, Networking, Storage and Analysis, SC 2014 - New Orleans, United States
Duration: Nov 16 2014Nov 21 2014

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Software

Fingerprint

Dive into the research topics of 'A Study on Balancing Parallelism, Data Locality, and Recomputation in Existing PDE Solvers'. Together they form a unique fingerprint.

Cite this