TY - GEN
T1 - Minimizing execution time in MPI programs on an energy-constrained, power-scalable cluster
AU - Springer, Robert
AU - Lowenthal, David K.
AU - Rountree, Barry
AU - Freeh, Vincent W.
PY - 2006
Y1 - 2006
AB - Recently, the high-performance computing community has realized that power is a performance-limiting factor. One reason for this is that supercomputing centers have limited power capacity and machines are starting to hit that limit. In addition, the cost of energy has become increasingly significant, and the heat produced by higher-energy components tends to reduce their reliability. One way to reduce power (and therefore energy) requirements is to use high-performance cluster nodes that are frequency- and voltage-scalable (e.g., AMD-64 processors). The problem we address in this paper is: given a target program, a power-scalable cluster, and an upper limit for energy consumption, choose a schedule (number of nodes and CPU frequency) that simultaneously (1) satisfies an external upper limit for energy consumption and (2) minimizes execution time. There are too many schedules for an exhaustive search. Therefore, we find a schedule through a novel combination of performance modeling, performance prediction, and program execution. Using our technique, we are able to find a near-optimal schedule for all of our benchmarks in just a handful of partial program executions.
KW - Energy
KW - MPI
KW - Modeling
KW - Power
KW - Prediction
UR - http://www.scopus.com/inward/record.url?scp=33751054291&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33751054291&partnerID=8YFLogxK
U2 - 10.1145/1122971.1123006
DO - 10.1145/1122971.1123006
M3 - Conference contribution
AN - SCOPUS:33751054291
SN - 1595931899
SN - 9781595931894
T3 - Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP
SP - 230
EP - 238
BT - Proceedings of the 2006 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP'06
PB - Association for Computing Machinery
T2 - 2006 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP'06
Y2 - 29 March 2006 through 31 March 2006
ER -