SGCNAX: A Scalable Graph Convolutional Neural Network Accelerator With Workload Balancing

Jiajun Li, Hao Zheng, Ke Wang, Ahmed Louri

Research output: Contribution to journal › Article › peer-review

9 Scopus citations


Graph Convolutional Neural Networks (GCNs) have emerged as promising tools for graph-based machine learning applications. Because GCNs are both compute- and memory-intensive, processing large-scale GCNs efficiently is a major challenge for the underlying hardware. In this article, we introduce SGCNAX, a scalable GCN accelerator architecture for the high-performance and energy-efficient acceleration of GCNs. Unlike prior GCN accelerators that either employ limited loop optimization techniques or determine the design variables based on random sampling, we systematically explore the loop optimization techniques for GCN acceleration and propose a flexible GCN dataflow that adapts to different GCN configurations to achieve optimal efficiency. We further propose two hardware-based techniques to address the workload imbalance problem caused by the unbalanced distribution of zeros in GCNs. Specifically, SGCNAX exploits an outer-product-based computation architecture that mitigates the workload imbalance within each processing element (intra-PE), and employs a group-and-shuffle approach to mitigate the workload imbalance across processing elements (inter-PE). Simulation results show that SGCNAX performs 9.2×, 1.6×, and 1.2× better, and reduces DRAM accesses by 9.7×, 2.9×, and 1.2×, compared to HyGCN, AWB-GCN, and GCNAX, respectively.
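As a rough illustration of the outer-product formulation mentioned in the abstract, the sketch below multiplies a sparse matrix by a dense matrix one column-row pair at a time. It is a software analogy only, not the SGCNAX hardware design: the function name `spmm_outer_product`, the column-list storage format, and the sample data are all assumptions made for this example.

```python
def spmm_outer_product(a_columns, x, n_rows):
    """Compute A @ X, where A is sparse and X is dense.

    a_columns: list of columns of A; each column is a list of
               (row_index, value) pairs for its nonzeros only.
               (Hypothetical storage format chosen for this sketch.)
    x:         dense matrix as a list of rows.
    n_rows:    number of rows in A (and in the result).
    """
    n_out_cols = len(x[0])
    out = [[0.0] * n_out_cols for _ in range(n_rows)]
    # Column k of A pairs with row k of X; together they form a
    # rank-1 (outer-product) update to the output matrix.
    for k, column in enumerate(a_columns):
        x_row = x[k]
        for i, a_ik in column:  # zeros in A are never visited
            for j, x_kj in enumerate(x_row):
                out[i][j] += a_ik * x_kj
    return out


# A = [[1, 0, 2],
#      [0, 0, 3],
#      [4, 0, 0]]   stored column by column, nonzeros only.
a_columns = [
    [(0, 1.0), (2, 4.0)],  # column 0
    [],                    # column 1 is all zeros
    [(0, 2.0), (1, 3.0)],  # column 2
]
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

result = spmm_outer_product(a_columns, x, n_rows=3)
print(result)  # [[11.0, 14.0], [15.0, 18.0], [4.0, 8.0]]
```

In this formulation every nonzero of the sparse matrix triggers the same amount of work (one scaled copy of a dense row), so the total work tracks the nonzero count rather than the matrix dimensions. How SGCNAX maps these updates onto processing elements is not described in the abstract and is not modeled here.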

Original language: English (US)
Pages (from-to): 2834-2845
Number of pages: 12
Journal: IEEE Transactions on Parallel and Distributed Systems
Issue number: 11
State: Published - Nov 1 2022
Externally published: Yes


Keywords

  • Graph convolutional neural networks
  • dataflow accelerators
  • domain-specific accelerators
  • memory access optimization

ASJC Scopus subject areas

  • Signal Processing
  • Hardware and Architecture
  • Computational Theory and Mathematics

