AdaPrune: An Accelerator-Aware Pruning Technique for Sustainable CNN Accelerators

Jiajun Li, Ahmed Louri

Research output: Contribution to journal › Article › peer-review

Abstract

Convolutional neural network (CNN) accelerators have achieved great success from cloud to edge scenarios. However, given the trend towards even larger and deeper neural network models, it remains challenging to process these CNNs efficiently, especially on edge devices with limited energy budgets. Accordingly, reducing energy consumption is of paramount importance for sustainable CNN accelerators. In this paper, we propose AdaPrune, a novel pruning technique that reduces model size and computation to achieve performance improvement and energy savings for CNN accelerators. Unlike previous pruning techniques that sacrifice either computational regularity or accuracy, AdaPrune maintains both by customizing CNN pruning for the underlying accelerators to maximally leverage the sparsity benefits. AdaPrune consists of two techniques: input channel group pruning and output channel group pruning. By analyzing the weight fetching patterns of sparse CNN accelerators, AdaPrune adaptively switches between the two techniques to guarantee that the zeros are evenly distributed in each fetching group. In doing so, the pruned network structure preserves customized computational regularity for the underlying accelerators, thereby boosting performance and energy efficiency. We evaluate AdaPrune on three sparse CNN accelerators with different spatial tiling strategies. The experimental results show that AdaPrune achieves up to 1.6× speedup and 1.5× energy savings compared to unstructured pruning.
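
The idea of keeping zeros evenly distributed across fetching groups can be illustrated with a minimal sketch. This is not the authors' implementation: the 2-D weight layout, the function name prune_groups, the magnitude-based ranking, and the group_size and keep parameters are all assumptions made for illustration. The abstract does not state how AdaPrune chooses between the two techniques, so the grouping axis is simply passed in by the caller here.

```python
def prune_groups(weights, group_size, keep, axis):
    """Zero all but the `keep` largest-magnitude weights in every group.

    Hypothetical layout: `weights` is a 2-D list indexed
    [out_channel][in_channel], one scalar weight per channel pair.
    axis="input"  groups consecutive input channels within one output channel.
    axis="output" groups consecutive output channels within one input channel.
    Magnitude ranking is an assumption, not taken from the paper.
    """
    n_out, n_in = len(weights), len(weights[0])
    pruned = [row[:] for row in weights]  # leave the caller's data untouched

    if axis == "input":
        groups = [[(o, i) for i in range(s, min(s + group_size, n_in))]
                  for o in range(n_out)
                  for s in range(0, n_in, group_size)]
    elif axis == "output":
        groups = [[(o, i) for o in range(s, min(s + group_size, n_out))]
                  for i in range(n_in)
                  for s in range(0, n_out, group_size)]
    else:
        raise ValueError("axis must be 'input' or 'output'")

    for group in groups:
        # Rank the positions in this group by absolute weight, largest first.
        ranked = sorted(group, key=lambda p: abs(weights[p[0]][p[1]]),
                        reverse=True)
        # Every full group ends up with the same number of zeros.
        for o, i in ranked[keep:]:
            pruned[o][i] = 0.0
    return pruned


example = [[0.9, 0.6, 0.1, 0.2],
           [0.5, 0.8, -0.3, 0.7]]
by_input = prune_groups(example, group_size=2, keep=1, axis="input")
by_output = prune_groups(example, group_size=2, keep=1, axis="output")
```

With these toy values, each group of two weights retains exactly one nonzero entry under either axis, but the surviving positions differ, which is why the choice of grouping axis matters for how an accelerator fetches the weights.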

Original language: English (US)
Pages (from-to): 47-60
Number of pages: 14
Journal: IEEE Transactions on Sustainable Computing
Volume: 7
Issue number: 1
State: Published - 2022

Keywords

  • Convolutional neural networks
  • model compression
  • weight pruning

ASJC Scopus subject areas

  • Software
  • Renewable Energy, Sustainability and the Environment
  • Hardware and Architecture
  • Control and Optimization
  • Computational Theory and Mathematics
