TY - JOUR
T1 - A run-time system for power-constrained HPC applications
AU - Marathe, Aniruddha
AU - Bailey, Peter E.
AU - Lowenthal, David K.
AU - Rountree, Barry
AU - Schulz, Martin
AU - de Supinski, Bronis R.
N1 - Funding Information:
Part of this work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under contract DE-AC52-07NA27344 (LLNL-CONF-667408).
Publisher Copyright:
© Springer International Publishing Switzerland 2015.
PY - 2015
Y1 - 2015
N2 - As the HPC community attempts to reach exascale performance, power will be one of the most critical constrained resources. Achieving practical exascale computing will therefore rely on optimizing performance subject to a power constraint. However, this additional complication should not add to the burden of application developers; optimizing the run-time environment given restricted power will primarily be the job of high-performance system software. This paper introduces Conductor, a run-time system that intelligently distributes available power to nodes and cores to improve performance. The key techniques used are configuration space exploration and adaptive power balancing. Configuration exploration dynamically selects the optimal thread concurrency level and DVFS state subject to a hardware-enforced power bound. Adaptive power balancing efficiently determines where critical paths are likely to occur so that more power is distributed to those paths. Greater power, in turn, allows increased thread concurrency levels, the DVFS states, or both. We describe these techniques in detail and show that, compared to the state-of-the-art technique of using statically predetermined, per-node power caps, Conductor leads to a best-case performance improvement of up to 30%, and average improvement of 19.1%.
AB - As the HPC community attempts to reach exascale performance, power will be one of the most critical constrained resources. Achieving practical exascale computing will therefore rely on optimizing performance subject to a power constraint. However, this additional complication should not add to the burden of application developers; optimizing the run-time environment given restricted power will primarily be the job of high-performance system software. This paper introduces Conductor, a run-time system that intelligently distributes available power to nodes and cores to improve performance. The key techniques used are configuration space exploration and adaptive power balancing. Configuration exploration dynamically selects the optimal thread concurrency level and DVFS state subject to a hardware-enforced power bound. Adaptive power balancing efficiently determines where critical paths are likely to occur so that more power is distributed to those paths. Greater power, in turn, allows increased thread concurrency levels, the DVFS states, or both. We describe these techniques in detail and show that, compared to the state-of-the-art technique of using statically predetermined, per-node power caps, Conductor leads to a best-case performance improvement of up to 30%, and average improvement of 19.1%.
UR - http://www.scopus.com/inward/record.url?scp=84978499980&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84978499980&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-20119-1_28
DO - 10.1007/978-3-319-20119-1_28
M3 - Conference article
AN - SCOPUS:84978499980
SN - 0302-9743
VL - 9137 LNCS
SP - 394
EP - 408
JO - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
JF - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
T2 - 30th International Conference on High Performance Computing, ISC 2015
Y2 - 12 July 2015 through 16 July 2015
ER -