Countermeasure against backdoor attacks using epistemic classifiers

2020 
In machine learning, backdoor or trojan attacks during model training can cause the targeted model to deceptively learn to misclassify inputs that contain specific triggers. This mechanism of deception gives the attacker full control over when the model's behavior turns malicious, simply by applying the trigger. In this paper, we introduce Epistemic Classifiers as a new category of defense mechanism and show their effectiveness in detecting backdoor attacks; such detection can invoke default fallback mechanisms, or solicit human intervention, whenever an untrustworthy model prediction could adversely impact the system within which the model operates. We present experimental results on multiple public datasets and use visualizations to explain why the proposed approach is effective. This capability empowers the warfighter to trust AI at the tactical edge when it is reliable, and to remain alert in scenarios with deception and noise where reliability cannot be guaranteed.
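The abstract does not spell out how an epistemic classifier decides whether a prediction is trustworthy. A common construction in this line of work is to compare a test input's hidden-layer activations against the activations of correctly labeled training points: if the nearest training neighbors in activation space do not agree with the model's predicted label, the prediction is flagged as untrusted. The following is a minimal sketch under that assumption (the function name `epistemic_check` and the unanimity rule are illustrative, not taken from the paper):

```python
import numpy as np

def epistemic_check(activation, predicted_label, train_acts, train_labels, k=5):
    """Flag a prediction as untrustworthy unless the k nearest training
    activations unanimously share the model's predicted label.
    Returns True ("I know") only under unanimous neighbor agreement."""
    dists = np.linalg.norm(train_acts - activation, axis=1)
    neighbor_labels = train_labels[np.argsort(dists)[:k]]
    return bool(np.all(neighbor_labels == predicted_label))

# Toy activation space: two well-separated class clusters.
rng = np.random.default_rng(0)
train_acts = np.vstack([rng.normal(0.0, 0.1, (20, 2)),   # class 0 near (0, 0)
                        rng.normal(1.0, 0.1, (20, 2))])  # class 1 near (1, 1)
train_labels = np.array([0] * 20 + [1] * 20)

# A clean input: activation deep inside the class-0 region, predicted class 0.
trusted = epistemic_check(np.array([0.0, 0.0]), 0, train_acts, train_labels)

# A backdoored input: the trigger pushes the activation into class-1
# territory while the (poisoned) model still outputs the attacker's label 0.
suspect = epistemic_check(np.array([1.0, 1.0]), 0, train_acts, train_labels)
```

Here `trusted` is `True` and `suspect` is `False`: the mismatch between the prediction and its neighborhood in activation space is what exposes the triggered input, which can then be routed to a fallback mechanism or a human operator as the abstract describes.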