Defect Analysis and Cost-Effective Resilience Architecture for Future DRAM Devices

2017 
Technology scaling has continuously improved the density, performance, energy efficiency, and cost of DRAM-based main memory systems. Starting from sub-20nm processes, however, the industry began to pay considerably higher costs to screen and manage a notably increasing number of defective cells. The traditional technique, which replaces the rows/columns containing faulty cells with spare rows/columns, has so far repaired defective cells cost-effectively, but it will soon become unaffordable because an excessive number of spare rows/columns is required to manage the increasing number of defective cells. This necessitates applying an alternative resilience technique, such as In-DRAM ECC, in synergy with the traditional one. Through extensive measurement and simulation, we first identify that aggressive miniaturization makes DRAM cells more sensitive to random telegraph noise or variable retention time, which manifests predominantly as a surge in randomly scattered single-cell faults. Second, we advocate using In-DRAM ECC to overcome the DRAM scaling challenges, and we architect In-DRAM ECC to achieve high area efficiency and minimal performance degradation. Moreover, we show that advancement in process technology reduces decoding/correction time to a small fraction of DRAM access time, and that the throughput penalty of a write operation, caused by the additional read needed for a parity update, is mostly overcome by the multi-bank structure and by long burst writes that span an entire In-DRAM ECC codeword. Lastly, we demonstrate that the proposed In-DRAM ECC architecture further improves system reliability under modern rank-level ECC schemes, such as single device data correction, by a factor of one hundred million.
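To make the mechanism concrete, the following is a minimal sketch of a single-error-correcting Hamming code, the kind of code an In-DRAM ECC scheme could use to repair a scattered single-cell fault. The abstract does not specify the code or the codeword size; the 128 data bits (which yield 8 parity bits) and the function names `encode`, `correct`, `extract`, and `partial_write` are assumptions made for illustration only. The `partial_write` helper shows why a write narrower than the codeword needs an extra read before the parity can be updated.

```python
def is_pow2(i):
    # True for 1, 2, 4, 8, ... (the parity-bit positions).
    return (i & (i - 1)) == 0


def syndrome_of(code):
    # XOR of the positions of all set bits; 0 for a valid codeword.
    s = 0
    for pos in range(1, len(code)):
        if code[pos]:
            s ^= pos
    return s


def encode(data_bits):
    # Returns a codeword as a list indexed 1..n (index 0 is unused).
    k = len(data_bits)
    r = 0
    while (1 << r) < k + r + 1:  # smallest r with 2^r >= k + r + 1
        r += 1
    n = k + r
    code = [0] * (n + 1)
    it = iter(data_bits)
    for pos in range(1, n + 1):
        if not is_pow2(pos):
            code[pos] = next(it)
    s = syndrome_of(code)
    for j in range(r):
        # Set each parity bit so that the overall syndrome becomes 0.
        code[1 << j] = (s >> j) & 1
    return code


def correct(code):
    # A nonzero syndrome is the position of the single flipped bit.
    s = syndrome_of(code)
    fixed = list(code)
    if 0 < s < len(code):
        fixed[s] ^= 1
    return fixed, s


def extract(code):
    # Recover the data bits from the non-parity positions.
    return [code[pos] for pos in range(1, len(code)) if not is_pow2(pos)]


def partial_write(code, offset, new_bits):
    # A write narrower than the codeword must first read (and correct)
    # the stored data, merge the new bits, then recompute the parity.
    data = extract(correct(code)[0])
    data[offset:offset + len(new_bits)] = new_bits
    return encode(data)
```

For example, encoding 128 data bits gives a 136-bit codeword; flipping any one stored bit and calling `correct` returns the original codeword along with the position of the faulty bit. A write that covers the whole codeword can call `encode` directly and skip the read inside `partial_write`, which is the case the long burst writes mentioned above aim for.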