TY - GEN
T1 - A fast parallel graph partitioner for shared-memory inspector/executor strategies
AU - Krieger, Christopher D.
AU - Strout, Michelle Mills
PY - 2013
Y1 - 2013
N2 - Graph partitioners play an important role in many parallel work distribution and locality optimization approaches. Surprisingly, however, to our knowledge there is no freely available parallel graph partitioner designed for execution on a shared memory multicore system. This paper presents a shared memory parallel graph partitioner, ParCubed, for use in the context of sparse tiling run-time data and computation reordering. Sparse tiling is a run-time scheduling technique that schedules groups of iterations across loops together when they access the same data and one or more of the loops contains indirect array accesses. For sparse tiling, which is implemented with an inspector/executor strategy, the inspector needs to find an initial seed partitioning of adequate quality very quickly. We compare our presented hierarchical clustering partitioner, ParCubed, with GPart and METIS in terms of partitioning speed, partitioning quality, and the effect the generated seed partitions have on executor speed. We find that the presented partitioner is 25 to 100 times faster than METIS on a 16 core machine. The total edge cut of the partitioning generated by ParCubed was found not to exceed 1.27x that of the partitioning found by METIS.
AB - Graph partitioners play an important role in many parallel work distribution and locality optimization approaches. Surprisingly, however, to our knowledge there is no freely available parallel graph partitioner designed for execution on a shared memory multicore system. This paper presents a shared memory parallel graph partitioner, ParCubed, for use in the context of sparse tiling run-time data and computation reordering. Sparse tiling is a run-time scheduling technique that schedules groups of iterations across loops together when they access the same data and one or more of the loops contains indirect array accesses. For sparse tiling, which is implemented with an inspector/executor strategy, the inspector needs to find an initial seed partitioning of adequate quality very quickly. We compare our presented hierarchical clustering partitioner, ParCubed, with GPart and METIS in terms of partitioning speed, partitioning quality, and the effect the generated seed partitions have on executor speed. We find that the presented partitioner is 25 to 100 times faster than METIS on a 16 core machine. The total edge cut of the partitioning generated by ParCubed was found not to exceed 1.27x that of the partitioning found by METIS.
KW - Graph partitioning
KW - Inspector/executor strategies
KW - Irregular applications
KW - Sparse tiling
UR - http://www.scopus.com/inward/record.url?scp=84893078692&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84893078692&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-37658-0_13
DO - 10.1007/978-3-642-37658-0_13
M3 - Conference contribution
AN - SCOPUS:84893078692
SN - 9783642376573
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 190
EP - 204
BT - Languages and Compilers for Parallel Computing - 25th International Workshop, LCPC 2012, Revised Selected Papers
T2 - 25th International Workshop on Languages and Compilers for Parallel Computing, LCPC 2012
Y2 - 11 September 2012 through 13 September 2012
ER -