Deep Learning for Video Face Recognition

2021 
This chapter addresses face recognition from videos or, more generally, from sets of images, using deep learning techniques. We first briefly review some simple yet commonly used strategies for applying frame-level features extracted by deep convolutional neural networks (CNNs) to video-level face recognition. Representative strategies include naive feature pooling and pairwise feature distance computation. We then present the neural aggregation network (NAN), a deep learning framework tailored for video-based representation and recognition. NAN automatically learns the quality of the faces in a video or image set and aggregates the frame-level deep features accordingly, yielding more discriminative video-level features. We evaluate the methods experimentally on three video face recognition datasets. The results indicate that, while previous deep learning-based methods using naive pooling or pairwise distances have obtained substantial improvements over traditional methods, NAN further outperforms them by an appreciable margin.
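
To make the contrast between these strategies concrete, the following is a minimal sketch in plain Python, not the chapter's implementation. It assumes frame-level features are already available as lists of floats. The softmax-over-dot-product scoring and the `query` vector are illustrative assumptions standing in for whatever learned quality estimate NAN uses; the abstract does not specify that mechanism, and all function names here are hypothetical.

```python
import math

def average_pool(features):
    """Naive pooling: unweighted mean of the frame-level feature vectors."""
    n = len(features)
    return [sum(col) / n for col in zip(*features)]

def min_pairwise_distance(set_a, set_b):
    """Pairwise strategy: smallest Euclidean distance between any two frames."""
    return min(math.dist(a, b) for a in set_a for b in set_b)

def weighted_aggregate(features, query):
    """Quality-weighted aggregation via a softmax over per-frame scores.

    ASSUMPTION: `query` stands in for a learned parameter. Here it is a
    fixed, hypothetical vector, not a trained one.
    """
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return pooled, weights

# Three toy 2-D "frame features"; the third plays the role of a poor frame.
frames = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0]]
query = [2.0, -2.0]  # hypothetical scoring direction

mean_feature = average_pool(frames)
video_feature, weights = weighted_aggregate(frames, query)
closest = min_pairwise_distance(frames, [[1.0, 0.0]])
```

With average pooling every frame contributes equally, so a poor frame pulls the video-level feature as strongly as a good one. In the weighted variant, frames that score low against `query` receive small weights and contribute little to the aggregate.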