TY - GEN
T1 - A Study of STT-RAM-based In-Memory Computing Across the Memory Hierarchy
AU - Gajaria, Dhruv
AU - Antony Gomez, Kevin
AU - Adegbija, Tosiron
N1 - Funding Information:
This work was partly supported by the National Science Foundation (NSF) under grant CNS-1844952. Any views expressed in this material are those of the authors and not necessarily of the NSF.
Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
AB - In-memory computing (or processing in memory) is a promising approach to reducing the data transfer bottleneck in computer systems by bringing computation closer to the memory. Prior work proposed using Spin-Transfer Torque RAM (STT-RAM) for in-memory computing to leverage STT-RAM's numerous advantages, including non-volatility, near-zero leakage power, high area density, better endurance than other non-volatile memory technologies, and demonstrated commercial viability. This paper explores, for the first time, the tradeoffs of STT-RAM in-memory computing across the memory hierarchy, including the main memory and cache hierarchy. We explore a system model in which processing in memory (PiM) occurs in non-volatile STT-RAM, whereas processing in cache (PiC) occurs in relaxed retention (volatile) STT-RAM. In relaxed retention STT-RAM caches, the retention time (the duration for which the STT-RAM cell retains data) is significantly reduced to mitigate STT-RAM's intrinsic write latency and write energy overheads. Importantly, we also analyze the tradeoffs between the data movement overheads of PiC and the write overheads of PiM in STT-RAM. The analysis is performed in the context of different kinds of workloads to explore the impacts of various workload characteristics (e.g., temporal locality, computational intensity, CPU-dependent workloads with limited instruction-level parallelism) on PiC/PiM tradeoffs. Using these workloads, we also evaluate computing in STT-RAM vs. SRAM at different levels of the cache hierarchy. Our analysis reveals that STT-RAM-based PiC has promising advantages over PiM in certain workload contexts and offers solutions to some of the challenges that arise in implementing PiC-enabled systems.
KW - in-cache computing
KW - in-memory computing
KW - relaxed retention time
KW - STT-RAM
UR - http://www.scopus.com/inward/record.url?scp=85145875833&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85145875833&partnerID=8YFLogxK
U2 - 10.1109/ICCD56317.2022.00105
DO - 10.1109/ICCD56317.2022.00105
M3 - Conference contribution
AN - SCOPUS:85145875833
T3 - Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors
SP - 685
EP - 692
BT - Proceedings - 2022 IEEE 40th International Conference on Computer Design, ICCD 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 40th IEEE International Conference on Computer Design, ICCD 2022
Y2 - 23 October 2022 through 26 October 2022
ER -