TY - GEN
T1 - Adaptive, transparent frequency and voltage scaling of communication phases in MPI programs
AU - Lim, Min Yeol
AU - Freeh, Vincent W.
AU - Lowenthal, David K.
PY - 2006
Y1 - 2006
N2 - Although users of high-performance computing are most interested in raw performance, both energy and power consumption have become critical concerns. Some microprocessors allow frequency and voltage scaling, which enables a system to reduce CPU performance and power when the CPU is not on the critical path. When properly directed, such dynamic frequency and voltage scaling can produce significant energy savings with little performance penalty. This paper presents an MPI runtime system that dynamically reduces CPU performance during communication phases in MPI programs. It dynamically identifies such phases and, without profiling or training, selects the CPU frequency in order to minimize energy-delay product. All analysis and subsequent frequency and voltage scaling is within MPI and so is entirely transparent to the application. This means that the large number of existing MPI programs, as well as new ones being developed, can use our system without modification. Results show that the average reduction in energy-delay product over the NAS benchmark suite is 10%; the average energy reduction is 12%, while the average execution time increase is only 2.1%.
AB - Although users of high-performance computing are most interested in raw performance, both energy and power consumption have become critical concerns. Some microprocessors allow frequency and voltage scaling, which enables a system to reduce CPU performance and power when the CPU is not on the critical path. When properly directed, such dynamic frequency and voltage scaling can produce significant energy savings with little performance penalty. This paper presents an MPI runtime system that dynamically reduces CPU performance during communication phases in MPI programs. It dynamically identifies such phases and, without profiling or training, selects the CPU frequency in order to minimize energy-delay product. All analysis and subsequent frequency and voltage scaling is within MPI and so is entirely transparent to the application. This means that the large number of existing MPI programs, as well as new ones being developed, can use our system without modification. Results show that the average reduction in energy-delay product over the NAS benchmark suite is 10%; the average energy reduction is 12%, while the average execution time increase is only 2.1%.
UR - http://www.scopus.com/inward/record.url?scp=34548243284&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34548243284&partnerID=8YFLogxK
U2 - 10.1145/1188455.1188567
DO - 10.1145/1188455.1188567
M3 - Conference contribution
AN - SCOPUS:34548243284
SN - 0769527000
SN - 9780769527000
T3 - Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, SC'06
BT - Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, SC'06
ER -