TY - GEN
T1 - CSCNN: Algorithm-hardware Co-design for CNN Accelerators using Centrosymmetric Filters
T2 - 27th Annual IEEE International Symposium on High Performance Computer Architecture, HPCA 2021
AU - Li, Jiajun
AU - Louri, Ahmed
AU - Karanth, Avinash
AU - Bunescu, Razvan
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/2
Y1 - 2021/2
AB - Convolutional neural networks (CNNs) are at the core of many state-of-the-art deep learning models in computer vision, speech, and text processing. Training and deploying such CNN-based architectures usually require a significant amount of computational resources. Sparsity has emerged as an effective compression approach for reducing the amount of data and computation for CNNs. However, sparsity often results in computational irregularity, which prevents accelerators from fully exploiting its benefits for performance and energy improvement. In this paper, we propose CSCNN, an algorithm/hardware co-design framework for CNN compression and acceleration that mitigates the effects of computational irregularity and provides better performance and energy efficiency. On the algorithmic side, CSCNN uses centrosymmetric matrices as convolutional filters, which reduces the number of required weights by nearly 50% and enables structured computational reuse without compromising regularity or accuracy. Additionally, complementary pruning techniques are leveraged to further reduce computation by a factor of 2.8-7.2× with marginal accuracy loss. On the hardware side, we propose a CSCNN accelerator that effectively exploits the structured computational reuse enabled by centrosymmetric filters and further eliminates zero-valued computations for increased performance and energy efficiency. Compared against a dense accelerator, SCNN, and SparTen, the proposed accelerator performs 3.7×, 1.6×, and 1.3× better, respectively, and improves the EDP (energy-delay product) by 8.9×, 2.8×, and 2.0×.
KW - Convolutional Neural Networks
KW - Domain-specific Accelerators
UR - http://www.scopus.com/inward/record.url?scp=85104955173&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85104955173&partnerID=8YFLogxK
U2 - 10.1109/HPCA51647.2021.00058
DO - 10.1109/HPCA51647.2021.00058
M3 - Conference contribution
AN - SCOPUS:85104955173
T3 - Proceedings - International Symposium on High-Performance Computer Architecture
SP - 612
EP - 625
BT - Proceedings - 27th Annual IEEE International Symposium on High Performance Computer Architecture, HPCA 2021
PB - IEEE Computer Society
Y2 - 27 February 2021 through 1 March 2021
ER -
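
A minimal sketch of the centrosymmetric-filter idea described in the abstract, assuming a 3×3 filter: a K×K matrix W is centrosymmetric when w[i, j] = w[K-1-i, K-1-j], i.e. it is unchanged by a 180-degree rotation, so only ceil(K²/2) of its K² entries are free parameters (5 of 9 for K = 3, approaching the 50% weight reduction cited in the abstract as K grows). The helper name make_centrosymmetric is illustrative only, not code from the paper.

# Illustrative sketch (not from the paper): a K x K centrosymmetric filter
# satisfies w[i, j] == w[K-1-i, K-1-j], i.e. it is invariant under a
# 180-degree rotation, so only ceil(K*K / 2) entries are free parameters.
import numpy as np

def make_centrosymmetric(free_params, K=3):
    """Build a K x K centrosymmetric filter from ceil(K*K / 2) free weights."""
    n = K * K
    half = (n + 1) // 2                # number of independent entries
    assert len(free_params) == half
    flat = np.empty(n)
    flat[:half] = free_params
    flat[half:] = free_params[:n - half][::-1]   # mirror the first half
    return flat.reshape(K, K)

rng = np.random.default_rng(0)
W = make_centrosymmetric(rng.standard_normal(5), K=3)
assert np.allclose(W, np.flip(W))      # 180-degree rotation leaves W unchanged
print(f"stored weights: 5 of 9 ({5 / 9:.0%})")   # ~44% fewer weights for K=3

Because mirrored weight pairs are identical, a product computed with one filter entry can be reused at the 180-degree-rotated position, which is the kind of structured computational reuse the abstract says the CSCNN accelerator exploits.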