Heterogeneous manycores composed of CPUs, GPUs, and accelerators place stringent demands on networks-on-chip (NoCs). The NoC must support the combined traffic, which includes both latency-sensitive CPU traffic and throughput-sensitive GPU and accelerator traffic. We study the characteristics of this combined traffic and observe that (1) limited injection bandwidth is the main obstacle to throughput improvement, and (2) latency caused by local and global contention accounts for a significant portion of the network latency. We propose ALPHA, a router architecture for heterogeneous manycores. ALPHA introduces two optimizations: (1) increasing injection bandwidth to improve throughput, and (2) resolving local and global contention to reduce network latency. Specifically, ALPHA increases the injection bandwidth through modifications to the injection link, crossbar switch, and buffer organization in the injection port of the router; it identifies upcoming local contention and resolves it by optimally selecting traffic routes; and it detects and alleviates global contention by using a supervised learning engine for traffic analysis, prediction, and adjustment. Simulation results using the Rodinia benchmarks show that, compared to the baseline router, ALPHA provides a 28 percent throughput increase, a 24 percent latency reduction, a 22 percent execution time speedup, and a 19 percent energy efficiency improvement.
Keywords
- Heterogeneous manycore
- Supervised learning
ASJC Scopus subject areas
- Renewable Energy, Sustainability and the Environment
- Hardware and Architecture
- Control and Optimization
- Computational Theory and Mathematics