TY - GEN
T1 - NBTI aware workload balancing in multi-core systems
AU - Sun, Jin
AU - Kodi, Avinash
AU - Louri, Ahmed
AU - Wang, Janet M.
PY - 2009
Y1 - 2009
N2 - As device feature sizes continue to shrink, reliability becomes a severe issue due to process variation, particle-induced transient errors, and transistor wear-out/stress such as Negative Bias Temperature Instability (NBTI). Unless this problem is addressed, chip multi-processor (CMP) systems face low yields and short mean-time-to-failure (MTTF). This paper proposes a new design framework for multi-core systems that accounts for the impact of device wear-out. Based on a device fractional NBTI model, we propose a new NBTI-aware system workload model and develop a new dynamic tile partition (DTP) algorithm to balance workload among active cores while relaxing stressed ones. Experimental results on 64 cores show that by allowing a small number of cores (around 10%) to relax for a short time period (10 seconds), the proposed methodology improves CMP system yield. We use the percentage of core failures to represent the yield improvement. The new strategy reduces the number of core failures by 20% and extends MTTF by 30% with little degradation in performance (less than 6%).
AB - As device feature sizes continue to shrink, reliability becomes a severe issue due to process variation, particle-induced transient errors, and transistor wear-out/stress such as Negative Bias Temperature Instability (NBTI). Unless this problem is addressed, chip multi-processor (CMP) systems face low yields and short mean-time-to-failure (MTTF). This paper proposes a new design framework for multi-core systems that accounts for the impact of device wear-out. Based on a device fractional NBTI model, we propose a new NBTI-aware system workload model and develop a new dynamic tile partition (DTP) algorithm to balance workload among active cores while relaxing stressed ones. Experimental results on 64 cores show that by allowing a small number of cores (around 10%) to relax for a short time period (10 seconds), the proposed methodology improves CMP system yield. We use the percentage of core failures to represent the yield improvement. The new strategy reduces the number of core failures by 20% and extends MTTF by 30% with little degradation in performance (less than 6%).
UR - http://www.scopus.com/inward/record.url?scp=67649656433&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=67649656433&partnerID=8YFLogxK
U2 - 10.1109/ISQED.2009.4810400
DO - 10.1109/ISQED.2009.4810400
M3 - Conference contribution
AN - SCOPUS:67649656433
SN - 9781424429530
T3 - Proceedings of the 10th International Symposium on Quality Electronic Design, ISQED 2009
SP - 833
EP - 838
BT - Proceedings of the 10th International Symposium on Quality Electronic Design, ISQED 2009
T2 - 10th International Symposium on Quality Electronic Design, ISQED 2009
Y2 - 16 March 2009 through 18 March 2009
ER -