This paper proposes SC++lite, a sequentially-consistent system that relaxes memory order speculatively to bridge the performance gap among memory consistency models. Prior proposals to speculatively relax memory order require large custom on-chip storage to maintain a history of speculative processor and memory state while memory order is relaxed. SC++lite uses the memory hierarchy to store the speculative history, providing a scalable path for speculative SC systems across a wide range of applications and system latencies. We use cycle-accurate simulation of shared-memory multiprocessors to show that SC++lite can fully relax memory order while virtually obviating the need for custom on-chip storage. Moreover, while demand for storage increases significantly with larger memory latencies, SC++lite's ability to relax memory order remains insensitive to memory latency. An SC++lite system can improve performance over a base SC system by 26% with only 1.7KB of custom storage in a system with 16 processors. In contrast, speculative SC systems with custom storage require 51.4KB of storage to improve performance by 29% over a base SC system.
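The storage trade-off in the closing figures can be made explicit with a little arithmetic. The sketch below is illustrative only: it uses nothing beyond the four numbers quoted in the abstract (16-processor system), and the names `sc_lite`, `custom`, `storage_ratio`, and `speedup_per_kb` are hypothetical, not taken from the paper.

```python
# Figures quoted in the abstract for a 16-processor system.
# The dictionary keys and helper names are assumptions for illustration.
sc_lite = {"speedup_pct": 26.0, "storage_kb": 1.7}   # SC++lite
custom = {"speedup_pct": 29.0, "storage_kb": 51.4}   # speculative SC with custom storage


def storage_ratio(small, large):
    """How many times more custom storage `large` needs than `small`."""
    return large["storage_kb"] / small["storage_kb"]


def speedup_per_kb(cfg):
    """Percentage points of improvement over base SC per KB of custom storage."""
    return cfg["speedup_pct"] / cfg["storage_kb"]


ratio = storage_ratio(sc_lite, custom)
gap = custom["speedup_pct"] - sc_lite["speedup_pct"]

print(f"custom storage ratio: {ratio:.1f}x")
print(f"improvement gap: {gap:.1f} percentage points")
print(f"SC++lite: {speedup_per_kb(sc_lite):.2f} points/KB")
print(f"custom-storage SC: {speedup_per_kb(custom):.2f} points/KB")
```

By these figures, the custom-storage design spends roughly 30 times more on-chip storage for about 3 additional percentage points of improvement over the base SC system.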
|Original language|English (US)|
|Title of host publication|Journal of Instruction-Level Parallelism|
|State|Published - Apr 2003|
ASJC Scopus subject areas
- Information Systems
- Hardware and Architecture