TY - GEN
T1 - Using focused regression for accurate time-constrained scaling of scientific applications
AU - Barnes, Brad
AU - Garren, Jeonifer
AU - Lowenthal, David K.
AU - Reeves, Jaxk
AU - de Supinski, Bronis R.
AU - Schulz, Martin
AU - Rountree, Barry
PY - 2010
Y1 - 2010
N2 - Many large-scale clusters now have hundreds of thousands of processors, and processor counts will be over one million within a few years. Computational scientists must scale their applications to exploit these new clusters. Time-constrained scaling, which is often used, tries to hold total execution time constant while increasing the problem size along with the processor count. However, complex interactions between parameters, the processor count, and execution time complicate determining the input parameters that achieve this goal. In this paper we develop a novel gray-box, focused regression-based approach that assists the computational scientist with maintaining constant run time on increasing processor counts. Combining application-level information from a small set of training runs, our approach allows prediction of the input parameters that result in similar per-processor execution time at larger scales. Our experimental validation across seven applications showed that median prediction errors are less than 13%.
AB - Many large-scale clusters now have hundreds of thousands of processors, and processor counts will be over one million within a few years. Computational scientists must scale their applications to exploit these new clusters. Time-constrained scaling, which is often used, tries to hold total execution time constant while increasing the problem size along with the processor count. However, complex interactions between parameters, the processor count, and execution time complicate determining the input parameters that achieve this goal. In this paper we develop a novel gray-box, focused regression-based approach that assists the computational scientist with maintaining constant run time on increasing processor counts. Combining application-level information from a small set of training runs, our approach allows prediction of the input parameters that result in similar per-processor execution time at larger scales. Our experimental validation across seven applications showed that median prediction errors are less than 13%.
UR - http://www.scopus.com/inward/record.url?scp=77954018761&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954018761&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2010.5470431
DO - 10.1109/IPDPS.2010.5470431
M3 - Conference contribution
AN - SCOPUS:77954018761
SN - 9781424464432
T3 - Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010
BT - Proceedings of the 2010 IEEE International Symposium on Parallel and Distributed Processing, IPDPS 2010
T2 - 24th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2010
Y2 - 19 April 2010 through 23 April 2010
ER -