TY - GEN
T1 - Workload capacity considering NBTI degradation in multi-core systems
AU - Sun, Jin
AU - Lysecky, Roman
AU - Shankar, Karthik
AU - Kodi, Avinash
AU - Louri, Ahmed
AU - Wang, Janet M.
PY - 2010
Y1 - 2010
N2 - As device feature sizes continue to shrink, long-term reliability issues such as Negative Bias Temperature Instability (NBTI) lead to low yields and short mean-time-to-failure (MTTF) in multi-core systems. This paper proposes a new workload balancing scheme based on a device-level fractional NBTI model to balance the workload among active cores while relaxing stressed ones. The proposed method employs the Capacity Rate (CR) provided by the NBTI model, applies a Dynamic Zoning (DZ) algorithm to group cores into zones to process task flows, and then uses Dynamic Task Scheduling (DTS) to allocate tasks in each zone with balanced workload and minimum communication cost. Experimental results on a 64-core system show that by allowing a small part of the cores to relax over a short time period (10 seconds), the proposed methodology improves multi-core system yield (percentage of core failures) by 20%, while extending MTTF by 30% with insignificant degradation in performance (less than 3%).
UR - https://www.scopus.com/pages/publications/77951213063
U2 - 10.1109/ASPDAC.2010.5419839
DO - 10.1109/ASPDAC.2010.5419839
M3 - Conference contribution
AN - SCOPUS:77951213063
SN - 9781424457656
T3 - Proceedings of the Asia and South Pacific Design Automation Conference, ASP-DAC
SP - 450
EP - 455
BT - 2010 15th Asia and South Pacific Design Automation Conference, ASP-DAC 2010
T2 - 2010 15th Asia and South Pacific Design Automation Conference, ASP-DAC 2010
Y2 - 18 January 2010 through 21 January 2010
ER -