TY - GEN
T1 - Exploring the Sparsity-Quantization Interplay on a Novel Hybrid SNN Event-Driven Architecture
AU - Aliyev, Ilkin
AU - Lopez, Jesus
AU - Adegbija, Tosiron
N1 - Publisher Copyright:
© 2025 EDAA.
PY - 2025
Y1 - 2025
AB - Spiking Neural Networks (SNNs) offer potential advantages in energy efficiency but currently trail Artificial Neural Networks (ANNs) in versatility, largely due to challenges in efficient input encoding. Recent work shows that direct coding achieves superior accuracy with fewer timesteps than traditional rate coding. However, there is a lack of specialized hardware to fully exploit the potential of direct-coded SNNs, especially their mix of dense and sparse layers. This work proposes the first hybrid inference architecture for direct-coded SNNs. The proposed hardware architecture comprises a dense core that efficiently processes the input layer and sparse cores optimized for event-driven spiking convolutions. Furthermore, for the first time, we investigate and quantify the effect of quantization on sparsity. Our experiments on two variants of the VGG9 network, implemented on a Xilinx Virtex UltraScale+ FPGA (Field-Programmable Gate Array), reveal two novel findings. First, quantization increases network sparsity by up to 15.2% with minimal loss of accuracy; combined with the inherent low-power benefits of quantization, this yields a 3.4× improvement in energy efficiency over the full-precision version. Second, direct coding outperforms rate coding, achieving a 10% improvement in accuracy while consuming 26.4× less energy per image. Overall, our accelerator achieves 51× higher throughput and consumes half the power compared to previous work.
KW - SNN accelerator
KW - Spiking neural networks
KW - neuromorphic computing
KW - quantization
KW - sparsity-aware SNN
UR - https://www.scopus.com/pages/publications/105006899402
UR - https://www.scopus.com/inward/citedby.url?scp=105006899402&partnerID=8YFLogxK
U2 - 10.23919/DATE64628.2025.10992893
DO - 10.23919/DATE64628.2025.10992893
M3 - Conference contribution
AN - SCOPUS:105006899402
T3 - Proceedings - Design, Automation and Test in Europe, DATE
BT - 2025 Design, Automation and Test in Europe Conference, DATE 2025 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2025 Design, Automation and Test in Europe Conference, DATE 2025
Y2 - 31 March 2025 through 2 April 2025
ER -