Efficient Defense Against Adversarial Attacks and Security Evaluation of Deep Learning Systems

2020 
Deep neural networks (DNNs) have achieved strong performance on classical artificial intelligence problems, including visual recognition and natural language processing. Unfortunately, recent studies show that machine learning models are vulnerable to adversarial attacks, in which purposeful distortions to inputs cause incorrect outputs. For images, such subtle distortions are usually imperceptible to humans, yet they successfully fool machine learning models. In this paper, we propose FeaturePro, a strategy for defending machine learning models against adversarial examples and for evaluating the security of deep learning systems. We tackle this challenge by reducing the feature space visible to the adversary. By performing white-box, black-box, targeted, and non-targeted attacks, we evaluate the security of deep learning algorithms, an important indicator for assessing artificial intelligence systems. We also analyze the generalization and robustness of FeaturePro when it is combined with adversarial training. FeaturePro efficiently defends against adversarial attacks with high accuracy and a low false-positive rate.
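As a concrete illustration of the kind of attack the abstract describes (not of FeaturePro itself, whose details are not given here), the sketch below crafts a small purposeful input distortion with the standard fast gradient sign method (FGSM) of Goodfellow et al.; the model, label, and epsilon budget are hypothetical placeholders.

```python
# Minimal FGSM sketch, assuming a PyTorch classifier with inputs in [0, 1].
# This is an illustrative example of an adversarial attack, not the
# paper's FeaturePro defense.
import torch
import torch.nn as nn

def fgsm_attack(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                epsilon: float = 0.03) -> torch.Tensor:
    """Perturb x so the classifier mislabels it, changing each pixel by at most epsilon."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction that increases the loss, one epsilon per pixel.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

if __name__ == "__main__":
    # Hypothetical toy setup: a linear classifier on 28x28 single-channel images.
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    x = torch.rand(1, 1, 28, 28)   # input image in [0, 1]
    y = torch.tensor([3])          # assumed true label
    x_adv = fgsm_attack(model, x, y, epsilon=0.03)
    print((x_adv - x).abs().max()) # perturbation is bounded by epsilon
```

The sign of the gradient gives the per-pixel worst-case direction under an L-infinity budget, which is why a distortion of only epsilon per pixel can flip a model's prediction while remaining nearly imperceptible.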