TY - GEN
T1 - ParSy
T2 - 2018 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018
AU - Cheshmi, Kazem
AU - Kamil, Shoaib
AU - Strout, Michelle Mills
AU - Dehnavi, Maryam Mehri
N1 - Funding Information:
This work is supported by the U.S. National Science Foundation (NSF) Award Numbers CCF-1657175 and CCF-1563732 and Adobe Research. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by the NSF grant number ACI-1548562.
Publisher Copyright:
© 2018 IEEE.
PY - 2018/7/2
Y1 - 2018/7/2
AB - In this work, we describe ParSy, a framework that uses a novel inspection strategy along with a simple code transformation to optimize parallel sparse algorithms for shared-memory processors. Unlike existing approaches that can suffer from load imbalance and excessive synchronization, ParSy uses a novel task coarsening strategy to create well-balanced tasks that can execute in parallel while maintaining locality of memory accesses. Code using the ParSy inspector and transformation outperforms existing highly optimized sparse matrix algorithms such as Cholesky factorization on multi-core processors, with speedups of 2.8× and 3.1× over the MKL Pardiso and PaStiX libraries, respectively.
KW - Domain-specific code generation
KW - Loop transformations
KW - Matrix computations
KW - Parallel algorithms
UR - http://www.scopus.com/inward/record.url?scp=85062795995&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85062795995&partnerID=8YFLogxK
U2 - 10.1109/SC.2018.00065
DO - 10.1109/SC.2018.00065
M3 - Conference contribution
AN - SCOPUS:85062795995
T3 - Proceedings - International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018
SP - 779
EP - 793
BT - Proceedings - International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2018
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 11 November 2018 through 16 November 2018
ER -