Face Anti-spoofing via Adversarial Cross-modality Translation

2021 
Face Presentation Attack Detection (PAD) approaches based on multi-modal data have attracted increasing attention from the research community. However, they require multi-modal face data to be available in both the training and testing phases. This severely limits their applicability, since most Face Anti-spoofing (FAS) systems are equipped only with Visible (VIS) imaging devices, i.e., RGB cameras. Therefore, how to use another modality (e.g., Near-Infrared (NIR)) to assist the performance of VIS-based PAD is an important question for FAS. In this work, we first discuss the large performance gap among different modalities even when the same backbone network is applied. Then, we propose a novel Cross-modal Auxiliary (CMA) framework for the VIS-based FAS task. The main trait of CMA is that performance can be greatly improved with the help of another modality while no extra modality is required in the testing stage. The proposed CMA consists of a Modality Translation Network (MT-Net) and a Modality Assistance Network (MA-Net). The former aims to close the visible gap between different modalities via a generative model that maps inputs from one modality (i.e., RGB) to another (i.e., NIR). The latter focuses on how to use the translated modality (i.e., target modality) and the RGB modality (i.e., source modality) together to train a discriminative PAD model. Extensive experiments demonstrate that the proposed framework pushes the state-of-the-art (SOTA) performance on both multi-modal datasets (i.e., CASIA-SURF, CeFA, and WMCA) and RGB-based datasets (i.e., OULU-NPU and SiW).
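
The two-stage structure described above can be illustrated with a minimal sketch, assuming PyTorch and simple placeholder architectures: the encoder-decoder layers, the 1-channel NIR output, and the channel-concatenation fusion are illustrative assumptions, not the paper's actual MT-Net/MA-Net designs.

```python
# Minimal sketch of the CMA pipeline: MT-Net translates RGB -> pseudo-NIR,
# MA-Net fuses the RGB input with the translated image to predict live vs. spoof.
# Layer sizes and the fusion strategy are assumptions for illustration only.
import torch
import torch.nn as nn

class MTNet(nn.Module):
    """Modality Translation Network: maps a VIS (RGB) face crop to a
    pseudo-NIR image with a small encoder-decoder generator (assumed)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1), nn.Tanh(),  # 1-channel pseudo-NIR
        )

    def forward(self, rgb):
        return self.decoder(self.encoder(rgb))

class MANet(nn.Module):
    """Modality Assistance Network: consumes the source RGB image and the
    translated pseudo-NIR image and outputs live/spoof logits."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3 + 1, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, 2)

    def forward(self, rgb, nir_fake):
        x = torch.cat([rgb, nir_fake], dim=1)  # early fusion by channel concat (assumed)
        feat = self.backbone(x).flatten(1)
        return self.classifier(feat)

# At test time only RGB input is needed: the pseudo-NIR image is generated
# on the fly, so no NIR sensor is required on the deployed FAS system.
mt_net, ma_net = MTNet(), MANet()
rgb = torch.randn(4, 3, 128, 128)       # a batch of RGB face crops
logits = ma_net(rgb, mt_net(rgb))       # shape (4, 2): live/spoof logits
```

This sketch only captures the inference path; the paper additionally trains MT-Net adversarially against real NIR data and trains MA-Net with both modalities, details of which are not reproduced here.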