Adversarial Attacks and Defense on Deep Learning Classification Models using $\mathrm{YC_bC_r}$ Color Images

2021 
Deep neural network models are vulnerable to adversarial perturbations that are subtle but change the model predictions. Adversarial perturbations are generally computed for RGB images and are, hence, equally distributed among the RGB channels. We show, for the first time, that adversarial perturbations prevail in the Y-channel of the $\mathrm{YC_bC_r}$ color space and exploit this finding to propose a defense mechanism. Our defense, ResUpNet, which is end-to-end trainable, removes perturbations only from the Y-channel by exploiting ResNet features in a bottleneck-free up-sampling framework. The refined Y-channel is combined with the untouched $\mathrm{C_bC_r}$-channels to restore the clean image. We compare ResUpNet to existing defenses in the input-transformation category and show that it achieves the best balance between maintaining the original accuracies on clean images and defending against adversarial attacks. Finally, we show that for the same attack and fixed perturbation magnitude, learning perturbations only in the Y-channel results in higher fooling rates. For example, with a very small perturbation magnitude ($\epsilon=0.002$), the fooling rates of FGSM and PGD attacks on the ResNet50 model increase by 11.1% and 15.6%, respectively, when the perturbations are learned only for the Y-channel.
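To make the Y-channel-only attack idea concrete, below is a minimal sketch (not the authors' implementation) of a single-step FGSM whose sign-gradient step is restricted to the Y-channel of a YCbCr representation. The `model`, `images`, `labels`, and `eps` names are assumptions, `model` is taken to be a PyTorch classifier operating on RGB tensors in [0, 1], and the conversion uses a standard full-range ITU-R BT.601 matrix.

```python
# Sketch only: FGSM with the perturbation applied to the Y-channel alone.
# Assumes a PyTorch classifier `model` taking (N, 3, H, W) RGB tensors in [0, 1].
import torch
import torch.nn.functional as F

# Full-range ITU-R BT.601 RGB <-> YCbCr matrices (Cb, Cr centered at 0).
_RGB2YCBCR = torch.tensor([[ 0.2990,  0.5870,  0.1140],
                           [-0.1687, -0.3313,  0.5000],
                           [ 0.5000, -0.4187, -0.0813]])
_YCBCR2RGB = torch.inverse(_RGB2YCBCR)

def rgb_to_ycbcr(x):
    # x: (N, 3, H, W) RGB -> (N, 3, H, W) YCbCr
    return torch.einsum('ij,njhw->nihw', _RGB2YCBCR.to(x), x)

def ycbcr_to_rgb(x):
    return torch.einsum('ij,njhw->nihw', _YCBCR2RGB.to(x), x)

def fgsm_y_channel(model, images, labels, eps=0.002):
    """One-step FGSM where only the Y-channel receives the sign-gradient step."""
    ycc = rgb_to_ycbcr(images).detach()
    ycc.requires_grad_(True)
    loss = F.cross_entropy(model(ycbcr_to_rgb(ycc)), labels)
    loss.backward()
    delta = torch.zeros_like(ycc)
    delta[:, 0] = eps * ycc.grad[:, 0].sign()   # perturb Y only; Cb/Cr untouched
    return ycbcr_to_rgb((ycc + delta).detach()).clamp(0.0, 1.0)
```

The same channel split underlies the defense described in the abstract: a denoiser (ResUpNet in the paper) would refine only the Y-channel and recombine it with the original Cb/Cr channels before classification.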