Boosting semi-supervised face recognition with raw faces

2022 
Deep face recognition benefits significantly from large-scale training data; however, high labeling costs remain a bottleneck. To reduce these costs, it is desirable to train a model using limited labeled data and abundant unlabeled data (i.e., semi-supervised learning). However, existing semi-supervised learning methods face two primary challenges: (1) Identities may overlap between the unlabeled and labeled data, and these overlaps can affect the correctness of the pseudo-labels assigned to the unlabeled set. (2) Different pseudo-labels generated by the clustering algorithm may belong to the same individual (i.e., the over-decomposition problem). Thus, in this study, instead of assuming non-overlapping conditions, we apply smooth labels to exploit the potential of unlabeled samples that are similar to identities in the labeled set. For samples that are not similar to any identity in the labeled set, we introduce a dual clustering strategy to remedy the over-decomposition problem caused by single clustering. With the upgraded semi-supervised framework, we recycle the samples discarded during the purification of MS-Celeb-1M (MS1M) to further scale up the training set, which yields a considerable performance boost, reaching 94.39% on the IJB-C dataset.
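The smooth-label idea for overlapping identities can be illustrated with a minimal sketch. It is not the authors' formulation: the function name `smooth_pseudo_label`, the cosine-similarity test against per-identity class centers, the `threshold` value, and the uniform label-smoothing scheme with `eps` are all assumptions made for illustration. An unlabeled sample close to a labeled identity receives a soft label concentrated on that identity; any other sample returns `None` and would be passed on to clustering.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def smooth_pseudo_label(embedding, class_centers, threshold=0.5, eps=0.1):
    """Return a smoothed soft label over the labeled identities, or None.

    Hypothetical helper: threshold and eps are illustrative values,
    not taken from the paper.
    """
    sims = [cosine(embedding, center) for center in class_centers]
    best = max(range(len(sims)), key=sims.__getitem__)
    if sims[best] < threshold:
        # Not similar to any labeled identity: leave it for clustering.
        return None
    k = len(class_centers)
    if k == 1:
        return [1.0]
    # Standard label smoothing: 1 - eps on the nearest identity,
    # eps shared equally among the remaining identities.
    label = [eps / (k - 1)] * k
    label[best] = 1.0 - eps
    return label


# Toy 2-D "embeddings" for three labeled identities (assumed data).
centers = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
near = smooth_pseudo_label([0.9, 0.1], centers)   # close to identity 0
far = smooth_pseudo_label([0.0, -1.0], centers)   # close to no identity
```

In this sketch a soft label, rather than a hard one, limits the damage when the nearest labeled identity is not actually the same person, which is the failure mode the overlap problem introduces.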