TY - GEN
T1 - GCNAX: A Flexible and Energy-efficient Accelerator for Graph Convolutional Neural Networks
T2 - 27th Annual IEEE International Symposium on High Performance Computer Architecture, HPCA 2021
AU - Li, Jiajun
AU - Louri, Ahmed
AU - Karanth, Avinash
AU - Bunescu, Razvan
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/2
Y1 - 2021/2
AB - Graph convolutional neural networks (GCNs) have emerged as an effective approach to extend deep learning for graph data analytics. Given that graphs are usually irregular, as nodes in a graph may have a varying number of neighbors, processing GCNs efficiently poses a significant challenge on the underlying hardware. Although specialized GCN accelerators have been proposed to deliver better performance than generic processors, prior accelerators not only under-utilize the compute engine but also impose redundant data accesses that reduce throughput and energy efficiency. Therefore, optimizing the overall flow of data between compute engines and memory, i.e., the GCN dataflow, to maximize utilization and minimize data movement is crucial for achieving efficient GCN processing. In this paper, we propose a flexible and optimized dataflow for GCNs that simultaneously improves resource utilization and reduces data movement. This is realized by fully exploring the design space of GCN dataflows and evaluating the number of execution cycles and DRAM accesses through an analysis framework. Unlike prior GCN dataflows, which employ rigid loop orders and loop fusion strategies, the proposed dataflow can reconfigure the loop order and loop fusion strategy to adapt to different GCN configurations, which results in much improved efficiency. We then introduce a novel accelerator architecture called GCNAX, which tailors the compute engine, buffer structure, and buffer size to the proposed dataflow. Evaluated on five real-world graph datasets, our simulation results show that GCNAX reduces DRAM accesses by factors of 8.1× and 2.4×, while achieving 8.9× and 1.6× speedup and 9.5× and 2.3× energy savings on average over HyGCN and AWB-GCN, respectively.
KW - Dataflow Accelerators
KW - Domain-specific Accelerators
KW - Graph Convolutional Neural Networks
UR - http://www.scopus.com/inward/record.url?scp=85104143274&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85104143274&partnerID=8YFLogxK
U2 - 10.1109/HPCA51647.2021.00070
DO - 10.1109/HPCA51647.2021.00070
M3 - Conference contribution
AN - SCOPUS:85104143274
T3 - Proceedings - International Symposium on High-Performance Computer Architecture
SP - 775
EP - 788
BT - Proceedings - 27th Annual IEEE International Symposium on High Performance Computer Architecture, HPCA 2021
PB - IEEE Computer Society
Y2 - 27 February 2021 through 1 March 2021
ER -