On Noise Stability and Robustness of Adversarially Trained Networks on NVM Crossbars

2022 
Applications based on deep neural networks (DNNs) have grown exponentially in the past decade. To match their increasing computational needs, several nonvolatile memory (NVM) crossbar-based accelerators have been proposed. Recently, researchers have shown that, apart from improved energy efficiency and performance, such approximate hardware also possesses intrinsic robustness that can serve as a defense against adversarial attacks. Prior works have focused on quantifying this intrinsic robustness for vanilla networks, i.e., DNNs trained on unperturbed inputs. However, adversarial training of DNNs, i.e., training with adversarially perturbed images, is the benchmark technique for robustness, and sole reliance on the intrinsic robustness of the hardware may not be sufficient. In this work, we explore the design of robust DNNs through the amalgamation of adversarial training and the intrinsic robustness offered by NVM crossbar-based analog hardware. First, we study the noise stability of such networks on unperturbed inputs and observe that the internal activations of adversarially trained networks have a lower signal-to-noise ratio (SNR) and are more sensitive to noise than those of vanilla networks. As a result, they suffer significantly higher performance degradation due to the approximate computations on analog hardware: on average, a $2\times$ accuracy drop. Noise stability analyses clearly show the instability of adversarially trained DNNs. On the other hand, for adversarial images generated using Square Black-Box attacks, ResNet-10/20 adversarially trained on CIFAR-10/100 display a robustness improvement of 20%–30% under high $\epsilon_{\mathrm{attack}}$ (degree of input perturbation).
For adversarial images generated using projected-gradient-descent (PGD) White-Box attacks, the adversarially trained DNNs present a 5%–10% gain in robust accuracy due to the underlying NVM crossbar when $\epsilon_{\mathrm{attack}}$ is greater than the perturbation budget used during adversarial training ($\epsilon_{\mathrm{train}}$). Our results indicate that implementing adversarially trained networks on analog hardware requires careful calibration between hardware nonidealities and $\epsilon_{\mathrm{train}}$ to achieve optimum robustness and performance.
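The noise-stability analysis described above compares the SNR of internal activations under analog-hardware noise. A minimal sketch of that measurement, assuming the crossbar nonidealities can be approximated as additive Gaussian noise on an activation tensor (a simplification; real crossbar error is data- and conductance-dependent), could look like the following; the function names and noise levels here are illustrative, not from the paper:

```python
import numpy as np

def snr_db(clean, noisy):
    """SNR (in dB) of a noisy activation tensor relative to its clean version."""
    signal_power = np.mean(clean ** 2)
    noise_power = np.mean((noisy - clean) ** 2)
    return 10.0 * np.log10(signal_power / noise_power)

rng = np.random.default_rng(0)
# Hypothetical layer activations (batch of 64, 128 features).
activations = rng.standard_normal((64, 128))

# Model analog-hardware nonidealities as zero-mean additive Gaussian noise
# and observe how the SNR falls as the noise level grows.
for sigma in (0.05, 0.1, 0.2):
    noisy = activations + rng.normal(0.0, sigma, activations.shape)
    print(f"sigma={sigma}: SNR = {snr_db(activations, noisy):.1f} dB")
```

Under this model, a network whose activations start at a lower SNR (as the abstract reports for adversarially trained networks) sits closer to the accuracy cliff for the same hardware noise level.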