TY - JOUR
T1 - Parallel implementation of the irregular terrain model (ITM) for radio transmission loss prediction using GPU and Cell BE processors
AU - Song, Yang
AU - Akoglu, Ali
N1 - Funding Information:
This work was funded in part by the Sensor Visualization and Data Fusion Workstation Program of Battle Command Battle Lab-Huachuca under Contract No. W15P7T-07-CP046. The authors would like to thank Dr. Paul Cook, Jeffery A. Rudin, and Gregory M. Striemer for their input throughout the project.
PY - 2011
Y1 - 2011
N2 - The Irregular Terrain Model (ITM), also known as the Longley-Rice model, predicts long-range average transmission loss of a radio signal based on atmospheric and geographic conditions. Because variable terrain effects and constantly changing atmospheric conditions can dramatically influence radio wave propagation, there is a pressing need for computational resources capable of running hundreds of thousands of transmission loss calculations per second. Multicore processors, such as the NVIDIA Graphics Processing Unit (GPU) and the IBM Cell Broadband Engine (BE), offer improved performance over mainstream microprocessors for ITM. We study architectural features of the Tesla C870 GPU and the Cell BE and evaluate the effectiveness of architecture-specific optimizations and parallelization strategies for ITM on these platforms. We assess GPU implementations that utilize both global and shared memories along with fine-grained parallelism, and Cell BE implementations that utilize direct memory access, double buffering, and SIMDization. With these optimization strategies, we achieve a computation time of less than a second on each platform, which is not feasible with a general-purpose processor. We observe that the GPU outperforms the Cell BE in total execution time and in performance per watt by factors of 2.3x and 1.6x, respectively.
AB - The Irregular Terrain Model (ITM), also known as the Longley-Rice model, predicts long-range average transmission loss of a radio signal based on atmospheric and geographic conditions. Because variable terrain effects and constantly changing atmospheric conditions can dramatically influence radio wave propagation, there is a pressing need for computational resources capable of running hundreds of thousands of transmission loss calculations per second. Multicore processors, such as the NVIDIA Graphics Processing Unit (GPU) and the IBM Cell Broadband Engine (BE), offer improved performance over mainstream microprocessors for ITM. We study architectural features of the Tesla C870 GPU and the Cell BE and evaluate the effectiveness of architecture-specific optimizations and parallelization strategies for ITM on these platforms. We assess GPU implementations that utilize both global and shared memories along with fine-grained parallelism, and Cell BE implementations that utilize direct memory access, double buffering, and SIMDization. With these optimization strategies, we achieve a computation time of less than a second on each platform, which is not feasible with a general-purpose processor. We observe that the GPU outperforms the Cell BE in total execution time and in performance per watt by factors of 2.3x and 1.6x, respectively.
KW - IBM Cell Broadband Engine
KW - Longley-Rice model
KW - NVIDIA GPU
KW - multicore
KW - parallel computing
UR - http://www.scopus.com/inward/record.url?scp=79959686681&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79959686681&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2011.21
DO - 10.1109/TPDS.2011.21
M3 - Article
AN - SCOPUS:79959686681
SN - 1045-9219
VL - 22
SP - 1276
EP - 1283
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 8
M1 - 5680900
ER -