A sharing-aware L1.5D cache for data reuse in GPGPUs

2019 
As GPUs move toward general-purpose computing, hardware caching, such as the first-level data (L1D) cache, has been introduced into the on-chip memory hierarchy of GPGPUs. However, under the massive multi-threading of GPGPUs, the small L1D requires better management to achieve a higher hit rate and benefit performance. In this paper, observing L1D usage inefficiencies such as data duplication among streaming multiprocessors (SMs), which wastes the precious L1D resources, we first propose a shared L1.5D cache that substitutes for the private L1D caches of several SMs, reducing duplicated data and in turn increasing the effective cache size available to each SM. We evaluate and adopt a suitable layout for the L1.5D cache to meet the timing requirements of GPGPUs. Then, to protect sharable data from early eviction, we propose a sharable-data-aware cache management scheme that leverages a lightweight PC-based history table to protect sharable data during cache replacement. Experiments demonstrate that the proposed design achieves an average 20.1% performance improvement and a 16.9% increase in on-chip hit rate for applications with sharable data.
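To make the replacement idea concrete, below is a minimal sketch of one plausible realization of a PC-based history table that protects sharable blocks during victim selection. This is an illustration, not the paper's exact design: the table size, the PC hashing, the 2-bit saturating counters, and the names (SharingHistoryTable, select_victim, etc.) are all assumptions introduced here for clarity.

```cpp
// Sketch: a PC-indexed history table steering victim selection away from
// blocks whose fetching instructions have historically produced data that
// is shared across SMs. All structure sizes and policies are assumed.
#include <array>
#include <cstdint>
#include <vector>

struct CacheBlock {
    uint64_t tag = 0;
    uint64_t fetch_pc = 0;   // PC of the load that filled this block
    uint32_t lru_age = 0;    // larger = older
    bool     valid = false;
};

class SharingHistoryTable {
    // 2-bit saturating counters indexed by a hash of the load PC.
    static constexpr size_t kEntries = 256;     // assumed table size
    std::array<uint8_t, kEntries> counters_{};  // 0..3; >= 2 means "sharable"

    static size_t index(uint64_t pc) { return (pc >> 2) % kEntries; }

public:
    // Train on each block eviction or probe: was the block ever
    // accessed on behalf of another SM while it was resident?
    void train(uint64_t pc, bool shared_across_sms) {
        uint8_t& c = counters_[index(pc)];
        if (shared_across_sms) { if (c < 3) ++c; }
        else                   { if (c > 0) --c; }
    }

    bool predicts_sharable(uint64_t pc) const {
        return counters_[index(pc)] >= 2;
    }
};

// Victim selection within one set: prefer the oldest block that is NOT
// predicted sharable; fall back to plain LRU if every block is protected.
int select_victim(const std::vector<CacheBlock>& set,
                  const SharingHistoryTable& sht) {
    int victim = -1, lru = 0;
    uint32_t victim_age = 0, lru_age = 0;
    for (int i = 0; i < static_cast<int>(set.size()); ++i) {
        if (!set[i].valid) return i;  // an invalid way wins outright
        if (set[i].lru_age >= lru_age) { lru = i; lru_age = set[i].lru_age; }
        if (!sht.predicts_sharable(set[i].fetch_pc) &&
            set[i].lru_age >= victim_age) {
            victim = i; victim_age = set[i].lru_age;
        }
    }
    return victim >= 0 ? victim : lru;  // all protected: degrade to LRU
}
```

The key design point this sketch captures is the lightweight cost the abstract emphasizes: prediction state is attached to load PCs rather than to individual cache lines, so a small table (here, 256 two-bit counters) can bias replacement for the entire shared L1.5D.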