TY - GEN
T1 - STAR-MPI: Self Tuned Adaptive Routines for MPI Collective Operations
T2 - 20th Annual International Conference on Supercomputing, ICS 2006
AU - Faraj, Ahmad
AU - Yuan, Xin
AU - Lowenthal, David
PY - 2006
Y1 - 2006
N2 - Message Passing Interface (MPI) collective communication routines are widely used in parallel applications. In order for a collective communication routine to achieve high performance for different applications on different platforms, it must be adaptable to both the system architecture and the application workload. Current MPI implementations do not support such software adaptability and are not able to achieve high performance on many platforms. In this paper, we present STAR-MPI (Self Tuned Adaptive Routines for MPI collective operations), a set of MPI collective communication routines that are capable of adapting to system architecture and application workload. For each operation, STAR-MPI maintains a set of communication algorithms that can potentially be efficient in different situations. As an application executes, a STAR-MPI routine applies the Automatic Empirical Optimization of Software (AEOS) technique at run time to dynamically select the best-performing algorithm for the application on the platform. We describe the techniques used in STAR-MPI, analyze STAR-MPI overheads, and evaluate the performance of STAR-MPI with applications and benchmarks. The results of our study indicate that STAR-MPI is robust and efficient. It is able to find efficient algorithms with reasonable overheads, and it outperforms traditional MPI implementations to a large degree in many cases.
AB - Message Passing Interface (MPI) collective communication routines are widely used in parallel applications. In order for a collective communication routine to achieve high performance for different applications on different platforms, it must be adaptable to both the system architecture and the application workload. Current MPI implementations do not support such software adaptability and are not able to achieve high performance on many platforms. In this paper, we present STAR-MPI (Self Tuned Adaptive Routines for MPI collective operations), a set of MPI collective communication routines that are capable of adapting to system architecture and application workload. For each operation, STAR-MPI maintains a set of communication algorithms that can potentially be efficient in different situations. As an application executes, a STAR-MPI routine applies the Automatic Empirical Optimization of Software (AEOS) technique at run time to dynamically select the best-performing algorithm for the application on the platform. We describe the techniques used in STAR-MPI, analyze STAR-MPI overheads, and evaluate the performance of STAR-MPI with applications and benchmarks. The results of our study indicate that STAR-MPI is robust and efficient. It is able to find efficient algorithms with reasonable overheads, and it outperforms traditional MPI implementations to a large degree in many cases.
UR - http://www.scopus.com/inward/record.url?scp=34248373234&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=34248373234&partnerID=8YFLogxK
U2 - 10.1145/1183401.1183431
DO - 10.1145/1183401.1183431
M3 - Conference contribution
AN - SCOPUS:34248373234
SN - 1595932828
SN - 9781595932822
T3 - Proceedings of the International Conference on Supercomputing
SP - 199
EP - 208
BT - Proceedings of the 20th Annual International Conference on Supercomputing, ICS 2006
Y2 - 28 June 2006 through 1 July 2006
ER -