C-Testing and Efficient Fault Localization for AI Accelerators

2021 
Accelerators for machine learning (AI) inferencing applications are homogeneous designs composed of identical cores. Each core, or processing element (PE), contains multiply-and-accumulate units, control logic, and registers for storing and forwarding weights and activations. Testing homogeneous array-based AI accelerator chips by running automatic test pattern generation (ATPG) at the array level results in high CPU time and a high pattern count. We propose a constant-testable (C-testable) method for test generation at the PE level such that the ATPG effort does not increase with the number of PEs. Our results show that, compared to traditional array-level testing, the proposed method achieves up to 4.2× (3.5×), 1530× (2388×), and 170× (142×) reductions in test pattern count, ATPG runtime, and test cycle count, respectively, for stuck-at (transition) faults in a 256×256 array, while preserving the test coverage. A reconfigurable scan architecture is introduced to enable the proposed C-testable solution for the entire accelerator array. The design-space exploration of a hierarchical test-compaction framework is presented. We also describe four debug solutions for fault localization and diagnosis.