Abstract
With continuously shrinking technology, reliability issues such as Negative Bias Temperature Instability (NBTI) have caused considerable degradation of device performance and, eventually, a short mean-time-to-failure (MTTF) for the whole multicore system. This article proposes a new workload balancing scheme, based on a device-level fractional NBTI model, that balances the workload among active cores while relaxing stressed ones. Starting with NBTI-induced threshold voltage degradation, we define the Capacity Rate (CR) as an indication of a core's ability to accept workload. The capacity rate captures a core's performance variability in terms of delay and power metrics under the impact of NBTI aging. The proposed workload balancing framework employs the capacity rates as workload constraints, applies a Dynamic Zoning (DZ) algorithm to group cores into zones to process task flows, and then uses Dynamic Task Scheduling (DTS) to allocate tasks in each zone with balanced workload and minimum communication cost. Experimental results on a 64-core system show that, by allowing a small fraction of the cores to relax over a short time period, the proposed methodology improves multicore system yield (percentage of core failures) by 20% and extends MTTF by 30%, with insignificant degradation in performance (less than 3%).
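The idea of using capacity rates as workload constraints can be illustrated with a toy allocator. The sketch below is not the article's DZ or DTS algorithm: it assumes a hypothetical capacity rate in [0, 1] per core (0 meaning the core is relaxing), hypothetical task loads in the same units, and a simple greedy rule that ignores zoning and communication cost. The function name `balance_by_capacity` and all values are illustrative assumptions.

```python
from typing import Dict, List, Tuple


def balance_by_capacity(
    capacity_rate: Dict[int, float],
    tasks: List[Tuple[str, float]],
) -> Dict[int, List[str]]:
    """Greedy sketch: give each task to the core with the most spare capacity.

    capacity_rate maps a core id to an assumed CR in [0, 1]; a CR of 0 marks
    a relaxing core that accepts no work. tasks holds (name, load) pairs.
    """
    # Only cores with a positive capacity rate are active.
    remaining = {core: cr for core, cr in capacity_rate.items() if cr > 0}
    assignment: Dict[int, List[str]] = {core: [] for core in capacity_rate}

    # Place the heaviest tasks first so they land while capacity is plentiful.
    for name, load in sorted(tasks, key=lambda t: t[1], reverse=True):
        core = max(remaining, key=remaining.get, default=None)
        if core is None or remaining[core] < load:
            raise ValueError(f"no active core can accept task {name!r}")
        assignment[core].append(name)
        remaining[core] -= load  # the capacity rate acts as a load budget

    return assignment


if __name__ == "__main__":
    # Core 2 is relaxing (CR = 0), so it receives no tasks.
    cores = {0: 1.0, 1: 0.6, 2: 0.0}
    work = [("a", 0.5), ("b", 0.4), ("c", 0.3)]
    print(balance_by_capacity(cores, work))
```

In this sketch a stressed core is taken out of service simply by setting its capacity rate to 0, which mirrors the abstract's notion of relaxing some cores while the active ones share the load.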
Original language | English (US) |
---|---|
Article number | 4 |
Journal | ACM Journal on Emerging Technologies in Computing Systems |
Volume | 10 |
Issue number | 1 |
State | Published - Jan 2014 |
Keywords
- Dynamic task scheduling
- Dynamic zoning
- Multicore systems
- Negative bias temperature instability
- Capacity rate
- Workload balancing
ASJC Scopus subject areas
- Software
- Hardware and Architecture
- Electrical and Electronic Engineering