BaFFLe: Backdoor Detection via Feedback-based Federated Learning

2021 
Recent studies have shown that federated learning (FL) is vulnerable to poisoning attacks that inject a backdoor into the global model. These attacks are effective even when performed by a single client, and they go undetected by most existing defensive techniques. In this paper, we propose Backdoor detection via Feedback-based Federated Learning (BaFFLe), a novel defense to secure FL against backdoor attacks. The core idea behind BaFFLe is to leverage the data of multiple clients not only for training but also for uncovering model poisoning. We exploit the availability of diverse datasets at the various clients by incorporating a feedback loop into the FL process, which integrates the views of those clients when deciding whether a given model update is genuine. We show that this construct can achieve very high detection rates against state-of-the-art backdoor attacks, even when relying on straightforward methods to validate the model. Through empirical evaluation on the CIFAR-10 and FEMNIST datasets, we show that by combining the feedback loop with a method that flags suspected poisoning attempts by assessing the per-class classification performance of the updated model, BaFFLe reliably detects state-of-the-art backdoor attacks with a detection accuracy of 100% and a false-positive rate below 5%. Moreover, we show that our solution can detect adaptive attacks aimed at bypassing the defense.
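
The feedback loop described above can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not specify the validation rule, so the comparison of per-class error rates between the previous and the updated model, the `threshold` and `quorum` parameters, and all function names below are assumptions made for illustration only.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Model = Callable[[Sequence[float]], int]      # one input -> predicted label
Dataset = List[Tuple[Sequence[float], int]]   # (input, true label) pairs


def per_class_error(model: Model, data: Dataset,
                    num_classes: int) -> List[Optional[float]]:
    """Error rate per class on one client's local data (None if class absent)."""
    wrong = [0] * num_classes
    total = [0] * num_classes
    for x, y in data:
        total[y] += 1
        if model(x) != y:
            wrong[y] += 1
    return [wrong[c] / total[c] if total[c] else None
            for c in range(num_classes)]


def client_suspects(old_model: Model, new_model: Model, data: Dataset,
                    num_classes: int, threshold: float = 0.2) -> bool:
    """Hypothetical rule: suspect poisoning if any class's error rate
    shifts by more than `threshold` (an assumed value) between models."""
    old = per_class_error(old_model, data, num_classes)
    new = per_class_error(new_model, data, num_classes)
    shifts = [abs(n - o) for o, n in zip(old, new)
              if o is not None and n is not None]
    return bool(shifts) and max(shifts) > threshold


def accept_update(old_model: Model, new_model: Model, clients: List[Dataset],
                  num_classes: int, quorum: float = 0.5) -> bool:
    """Server side: accept the update only if fewer than a `quorum`
    fraction (assumed) of validating clients report it as suspicious."""
    reject_votes = sum(client_suspects(old_model, new_model, d, num_classes)
                       for d in clients)
    return reject_votes < quorum * len(clients)


# Toy stand-ins: the label is encoded in the first feature. The "poisoned"
# model simply misclassifies class 1 as class 0; a real backdoor would be
# triggered only by specific inputs.
clean = lambda x: int(x[0])
poisoned = lambda x: 0 if int(x[0]) == 1 else int(x[0])
local_data = [([float(c)], c) for c in (0, 1, 2, 0, 1, 2)]
clients = [local_data, local_data, local_data]

print(accept_update(clean, clean, clients, 3))     # True
print(accept_update(clean, poisoned, clients, 3))  # False
```

In this sketch each validating client sees only its own data and returns a single vote, so the server never needs access to client datasets; the design choice of a simple vote count is an assumption, not a detail taken from the abstract.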