Exploring Warp Criticality in Near-Threshold GPGPU Applications Using a Dynamic Choke Point Analysis

2019 
General-purpose graphics processing units (GPGPUs), owing to their enormous parallelism, are now ubiquitous in parallel computing. Their peak power rating, however, has also increased over the years, and near-threshold computing (NTC) has emerged as a remedy. Yet severe device-level delay variability arising from process variation (PV) can significantly diminish NTC system performance. In this article, we examine choke points, a unique device-level characteristic of PV at NTC, which can exacerbate the delays of parallel GPGPU warps. To improve NTC GPU performance, we propose a family of holistic circuit-architectural solutions, referred to as the choke-point-aware warp speculator (CPAWS). CPAWS identifies the choke-point-induced critical warps in GPGPU applications and improves their execution latencies. Compared to a state-of-the-art warp scheduling policy, our best scheme improves the performance and energy efficiency of an NTC GPU by ~39% and ~31%, respectively.