TY - JOUR
T1 - Just-in-time dynamic voltage scaling
T2 - Exploiting inter-node slack to save energy in MPI programs
AU - Freeh, Vincent W.
AU - Kappiah, Nandini
AU - Lowenthal, David K.
AU - Bletsch, Tyler K.
N1 - Funding Information:
This research was funded in part by a University Partnership award from IBM and NSF grants CCF-0429643 and CNS-0410203.
PY - 2008/9
Y1 - 2008/9
N2 - Although users of high-performance computing are most interested in raw performance, both energy and power consumption have become critical concerns. As a result, improving energy efficiency of nodes on HPC machines has become important, and the prevalence of power-scalable clusters, where the frequency and voltage can be dynamically modified, has increased. On power-scalable clusters, one opportunity for saving energy with little or no loss of performance exists when the computational load is not perfectly balanced. This situation occurs frequently, as keeping the load balanced between nodes is one of the long-standing fundamental problems in parallel and distributed computing. Indeed, despite the large body of research aimed at balancing load both statically and dynamically, this problem is quite difficult to solve. This paper presents a system called Jitter that reduces the frequency and voltage on nodes that are assigned less computation and, therefore, have idle or slack time. This saves energy on these nodes, and the goal of Jitter is to attempt to ensure that they arrive "just in time" so that they avoid increasing overall execution time. Specifically, we dynamically determine which nodes have enough slack time such that they can execute at a reduced frequency with little performance cost, which will greatly reduce the consumed energy on that node. In particular, Jitter saves 12.8% energy with 0.4% time increase, which is essentially the same as a hand-tuned solution, on the Aztec benchmark.
AB - Although users of high-performance computing are most interested in raw performance, both energy and power consumption have become critical concerns. As a result, improving energy efficiency of nodes on HPC machines has become important, and the prevalence of power-scalable clusters, where the frequency and voltage can be dynamically modified, has increased. On power-scalable clusters, one opportunity for saving energy with little or no loss of performance exists when the computational load is not perfectly balanced. This situation occurs frequently, as keeping the load balanced between nodes is one of the long-standing fundamental problems in parallel and distributed computing. Indeed, despite the large body of research aimed at balancing load both statically and dynamically, this problem is quite difficult to solve. This paper presents a system called Jitter that reduces the frequency and voltage on nodes that are assigned less computation and, therefore, have idle or slack time. This saves energy on these nodes, and the goal of Jitter is to attempt to ensure that they arrive "just in time" so that they avoid increasing overall execution time. Specifically, we dynamically determine which nodes have enough slack time such that they can execute at a reduced frequency with little performance cost, which will greatly reduce the consumed energy on that node. In particular, Jitter saves 12.8% energy with 0.4% time increase, which is essentially the same as a hand-tuned solution, on the Aztec benchmark.
KW - Distributed computing
KW - Message passing interface (MPI)
KW - Power-aware
UR - http://www.scopus.com/inward/record.url?scp=48849114531&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=48849114531&partnerID=8YFLogxK
U2 - 10.1016/j.jpdc.2008.04.007
DO - 10.1016/j.jpdc.2008.04.007
M3 - Article
AN - SCOPUS:48849114531
SN - 0743-7315
VL - 68
SP - 1175
EP - 1185
JO - Journal of Parallel and Distributed Computing
JF - Journal of Parallel and Distributed Computing
IS - 9
ER -