TY - GEN
T1 - Application-Specific Autonomic Cache Tuning for General Purpose GPUs
AU - Gianelli, Sam
AU - Richter, Edward
AU - Jimenez, Diego
AU - Valdez, Hugo
AU - Adegbija, Tosiron
AU - Akoglu, Ali
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/10/9
Y1 - 2017/10/9
N2 - Cache tuning has been widely studied in CPUs and shown to achieve substantial energy savings with minimal performance degradation. However, cache tuning has yet to be explored in General Purpose Graphics Processing Units (GPGPUs), which have emerged as efficient alternatives for general-purpose high-performance computing. In this paper, we explore autonomic cache tuning for GPGPUs, where the cache configurations (cache size, line size, and associativity) can be dynamically specialized/tuned to the executing applications' resource requirements. We investigate cache tuning for both the level one (L1) and level two (L2) caches to derive insights into which cache level offers the greatest optimization benefits. To illustrate the optimization potential of autonomic cache tuning in GPGPUs, we implement a tuning heuristic that can dynamically determine each application's best L1 data cache configuration at runtime. Our results show that application-specific autonomic L1 data cache tuning can reduce the average energy delay product (EDP) by 16.5% and improve performance by 18.8%, compared to a static cache.
AB - Cache tuning has been widely studied in CPUs and shown to achieve substantial energy savings with minimal performance degradation. However, cache tuning has yet to be explored in General Purpose Graphics Processing Units (GPGPUs), which have emerged as efficient alternatives for general-purpose high-performance computing. In this paper, we explore autonomic cache tuning for GPGPUs, where the cache configurations (cache size, line size, and associativity) can be dynamically specialized/tuned to the executing applications' resource requirements. We investigate cache tuning for both the level one (L1) and level two (L2) caches to derive insights into which cache level offers the greatest optimization benefits. To illustrate the optimization potential of autonomic cache tuning in GPGPUs, we implement a tuning heuristic that can dynamically determine each application's best L1 data cache configuration at runtime. Our results show that application-specific autonomic L1 data cache tuning can reduce the average energy delay product (EDP) by 16.5% and improve performance by 18.8%, compared to a static cache.
KW - GPGPU
KW - GPU cache management
KW - Graphics processing unit
KW - adaptable hardware
KW - cache memories
KW - cache tuning
KW - configurable caches
KW - high performance computing
KW - low-power design
KW - low-power embedded systems
UR - http://www.scopus.com/inward/record.url?scp=85035325647&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85035325647&partnerID=8YFLogxK
U2 - 10.1109/ICCAC.2017.17
DO - 10.1109/ICCAC.2017.17
M3 - Conference contribution
AN - SCOPUS:85035325647
T3 - Proceedings - 2017 IEEE International Conference on Cloud and Autonomic Computing, ICCAC 2017
SP - 104
EP - 113
BT - Proceedings - 2017 IEEE International Conference on Cloud and Autonomic Computing, ICCAC 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th IEEE International Conference on Cloud and Autonomic Computing, ICCAC 2017
Y2 - 18 September 2017 through 22 September 2017
ER -