TY - GEN
T1 - Code Synthesis for Sparse Tensor Format Conversion and Optimization
AU - Popoola, Tobi
AU - Zhao, Tuowen
AU - St. George, Aaron
AU - Bhetwal, Kalyan
AU - Strout, Michelle Mills
AU - Hall, Mary
AU - Olschanowsky, Catherine
N1 - Publisher Copyright:
© 2023 ACM.
PY - 2023/2/17
Y1 - 2023/2/17
N2 - Many scientific applications compute using sparse data and store that data in a variety of sparse formats because each format has unique space and performance benefits. Optimizing applications that use sparse data involves translating the sparse data into the chosen format and transforming the computation to iterate over that format. This paper presents a formal definition of sparse tensor formats and an automated approach to synthesize the transformation between formats. This approach is unique in that it supports ordering constraints not supported by other approaches and synthesizes the transformation code in a high-level intermediate representation suitable for applying composable transformations such as loop fusion and temporary storage reduction. We demonstrate that the synthesized code for COO to CSR with optimizations is 2.85x faster than TACO, Intel MKL, and SPARSKIT while the more complex COO to DIA is 1.4x slower than TACO but faster than SPARSKIT and Intel MKL using the geometric average of execution time.
AB - Many scientific applications compute using sparse data and store that data in a variety of sparse formats because each format has unique space and performance benefits. Optimizing applications that use sparse data involves translating the sparse data into the chosen format and transforming the computation to iterate over that format. This paper presents a formal definition of sparse tensor formats and an automated approach to synthesize the transformation between formats. This approach is unique in that it supports ordering constraints not supported by other approaches and synthesizes the transformation code in a high-level intermediate representation suitable for applying composable transformations such as loop fusion and temporary storage reduction. We demonstrate that the synthesized code for COO to CSR with optimizations is 2.85x faster than TACO, Intel MKL, and SPARSKIT while the more complex COO to DIA is 1.4x slower than TACO but faster than SPARSKIT and Intel MKL using the geometric average of execution time.
KW - Code synthesis
KW - Index array properties
KW - Polyhedral compilation
KW - Sparse Format Conversion
KW - Sparser Format Descriptors
KW - Transformations
KW - Uninterpreted functions
UR - http://www.scopus.com/inward/record.url?scp=85149770020&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85149770020&partnerID=8YFLogxK
U2 - 10.1145/3579990.3580021
DO - 10.1145/3579990.3580021
M3 - Conference contribution
AN - SCOPUS:85149770020
T3 - CGO 2023 - Proceedings of the 21st ACM/IEEE International Symposium on Code Generation and Optimization
SP - 28
EP - 40
BT - CGO 2023 - Proceedings of the 21st ACM/IEEE International Symposium on Code Generation and Optimization
A2 - Dubach, Christophe
A2 - Bruening, Derek
A2 - Hardekopf, Ben
PB - Association for Computing Machinery, Inc
T2 - 21st ACM/IEEE International Symposium on Code Generation and Optimization, CGO 2023
Y2 - 25 February 2023 through 1 March 2023
ER -