Countermeasure against backdoor attacks using epistemic classifiers

2020 
In machine learning, backdoor or trojan attacks during model training can cause the targeted model to deceptively learn to misclassify inputs that contain specific triggers. This mechanism of deception gives the attacker full control over when the model's behavior turns malicious, simply by applying the trigger. In this paper, we introduce Epistemic Classifiers as a new category of defense mechanism and show their effectiveness in detecting backdoor attacks; such detection can invoke default fallback mechanisms, or solicit human intervention, whenever an untrustworthy model prediction could adversely impact the system within which the model operates. We present experimental results on multiple public datasets and use visualizations to explain why the proposed approach is effective. This capability empowers the warfighter to trust AI at the tactical edge when it is reliable, and to remain alert in scenarios with deception and noise where reliability cannot be guaranteed.
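The abstract does not spell out how an epistemic classifier decides whether a prediction is trustworthy. A common construction in this line of work is to compare a test input's hidden-layer activations against the activations of correctly labeled training points: if the nearest training neighbors in activation space do not agree with the model's predicted label, the prediction is flagged as untrusted. The following is a minimal sketch under that assumption (the function name `epistemic_check` and the unanimity rule are illustrative, not taken from the paper):

```python
import numpy as np

def epistemic_check(activation, predicted_label, train_acts, train_labels, k=5):
    """Flag a prediction as untrustworthy unless the k nearest training
    activations unanimously share the model's predicted label.
    Returns True ("I know") only under unanimous neighbor agreement."""
    dists = np.linalg.norm(train_acts - activation, axis=1)
    neighbor_labels = train_labels[np.argsort(dists)[:k]]
    return bool(np.all(neighbor_labels == predicted_label))

# Toy activation space: two well-separated class clusters.
rng = np.random.default_rng(0)
train_acts = np.vstack([rng.normal(0.0, 0.1, (20, 2)),   # class 0 near (0, 0)
                        rng.normal(1.0, 0.1, (20, 2))])  # class 1 near (1, 1)
train_labels = np.array([0] * 20 + [1] * 20)

# A clean input: activation deep inside the class-0 region, predicted class 0.
trusted = epistemic_check(np.array([0.0, 0.0]), 0, train_acts, train_labels)

# A backdoored input: the trigger pushes the activation into class-1
# territory while the (poisoned) model still outputs the attacker's label 0.
suspect = epistemic_check(np.array([1.0, 1.0]), 0, train_acts, train_labels)
```

Here `trusted` is `True` and `suspect` is `False`: the mismatch between the prediction and its neighborhood in activation space is what exposes the triggered input, which can then be routed to a fallback mechanism or a human operator as the abstract describes.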