Cache tuning has been widely studied for CPUs and has been shown to achieve substantial energy savings with minimal performance degradation. However, cache tuning has yet to be explored for general-purpose graphics processing units (GPGPUs), which have emerged as efficient alternatives for general-purpose high-performance computing. In this paper, we explore autonomic cache tuning for GPGPUs, in which the cache configuration (cache size, line size, and associativity) is dynamically specialized to the resource requirements of the executing application. We investigate cache tuning for both the level one (L1) and level two (L2) caches to determine which cache level offers the greatest optimization benefit. To illustrate the optimization potential of autonomic cache tuning in GPGPUs, we implement a tuning heuristic that dynamically determines each application's best L1 data cache configuration at runtime. Our results show that application-specific autonomic L1 data cache tuning reduces the average energy-delay product (EDP) by 16.5% and improves performance by 18.8%, compared to a static cache.
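
As an illustration of the kind of search such a tuning heuristic performs, the following minimal sketch selects a cache configuration by minimizing EDP, taken here as energy multiplied by execution time. It is not the heuristic evaluated in this paper: the greedy one-parameter-at-a-time search, the names `edp`, `tune_cache`, `toy_measure`, and `SPACE`, and the cost model inside `toy_measure` are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, int]


def edp(energy: float, delay: float) -> float:
    """Energy-delay product; lower is better."""
    return energy * delay


def tune_cache(
    space: Dict[str, List[int]],
    measure: Callable[[Config], Tuple[float, float]],
) -> Tuple[Config, float]:
    """Greedy search (an assumption, not the paper's heuristic).

    Tunes one parameter at a time, in the order given by `space`,
    keeping a new value only if it lowers the measured EDP.
    """
    # Start from the first listed value of every parameter.
    best = {name: values[0] for name, values in space.items()}
    best_edp = edp(*measure(best))
    for name, values in space.items():
        for value in values[1:]:
            candidate = dict(best)
            candidate[name] = value
            candidate_edp = edp(*measure(candidate))
            if candidate_edp < best_edp:
                best, best_edp = candidate, candidate_edp
    return best, best_edp


def toy_measure(cfg: Config) -> Tuple[float, float]:
    """Made-up cost model returning (energy, delay) in arbitrary units.

    A real tuner would read hardware counters or simulator output here.
    """
    delay = (
        1.0
        + abs(cfg["size_kb"] - 32) / 32
        + abs(cfg["line_bytes"] - 128) / 128
        + abs(cfg["assoc"] - 4) / 4
    )
    energy = 1.0 + cfg["size_kb"] / 64
    return energy, delay


# Hypothetical design space; the values are illustrative only.
SPACE = {
    "size_kb": [16, 32, 64],
    "line_bytes": [64, 128, 256],
    "assoc": [2, 4, 8],
}

if __name__ == "__main__":
    config, score = tune_cache(SPACE, toy_measure)
    print(config, score)
```

In this sketch, `measure` is the only point of contact with the hardware or simulator, so the same search loop could be reused with a different cost source or a different objective than EDP.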