Although users of high-performance computing are most interested in raw performance, both energy and power consumption have become critical concerns. As a result, improving energy efficiency of nodes on HPC machines has become important, and the prevalence of power-scalable clusters, where the frequency and voltage can be dynamically modified, has increased. On power-scalable clusters, one opportunity for saving energy with little or no loss of performance exists when the computational load is not perfectly balanced. This situation occurs frequently, as keeping the load balanced between nodes is one of the long-standing fundamental problems in parallel and distributed computing. Indeed, despite the large body of research aimed at balancing load both statically and dynamically, this problem is quite difficult to solve. This paper presents a system called Jitter that reduces the frequency and voltage on nodes that are assigned less computation and, therefore, have idle or slack time. This saves energy on these nodes, and the goal of Jitter is to attempt to ensure that they arrive "just in time" so that they avoid increasing overall execution time. Specifically, we dynamically determine which nodes have enough slack time such that they can execute at a reduced frequency with little performance cost-which will greatly reduce the consumed energy on that node. In particular, Jitter saves 12.8% energy with 0.4% time increase-which is essentially the same as a hand-tuned solution-on the Aztec benchmark.
- Distributed computing
- Message passing interface (MPI)
ASJC Scopus subject areas
- Theoretical Computer Science
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence