Selective, Structural, Subtle: Trilinear Spatial-Awareness for Few-Shot Fine-Grained Visual Recognition

2021 
Few-shot learning aims to recognize novel categories from a few examples. However, most existing approaches focus on general image classification and fail to handle subtle differences between images. To alleviate this issue, we propose a trilinear spatial-awareness network for few-shot fine-grained visual recognition, called S3Net, which is composed of a spatial selection module, a structural pyramid descriptor, and a subtle difference mining module. Specifically, the spatial selection module first builds global relations to strengthen the features. The structural pyramid descriptor then constructs a multi-scale representation that enriches contextual information by exploiting different receptive fields in the same feature layer. Furthermore, a similarity loss based on local descriptors and a global classification loss are designed to help the network learn discriminative capability by handling subtle differences in confusing or near-duplicate samples. Extensive experiments on four few-shot fine-grained benchmarks demonstrate that the proposed approach is effective and outperforms state-of-the-art models by large margins.
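The abstract does not specify how the structural pyramid descriptor is built, so the following is only a minimal sketch of the general idea of reading one feature layer at several receptive-field sizes. It assumes non-overlapping average pooling as a stand-in for the multi-scale operation, a single-channel map stored as nested lists, and window sizes of 1, 2, and 4. The names `avg_pool` and `pyramid_descriptor` are hypothetical and do not come from S3Net.

```python
from typing import List, Sequence

# Single-channel H x W feature map, used for illustration only (assumption).
FeatureMap = List[List[float]]


def avg_pool(fmap: FeatureMap, window: int) -> FeatureMap:
    """Non-overlapping average pooling with a square window (stride == window)."""
    height, width = len(fmap), len(fmap[0])
    if height % window or width % window:
        raise ValueError("window must divide both spatial dimensions")
    pooled = []
    for top in range(0, height, window):
        row = []
        for left in range(0, width, window):
            cells = [
                fmap[r][c]
                for r in range(top, top + window)
                for c in range(left, left + window)
            ]
            row.append(sum(cells) / len(cells))
        pooled.append(row)
    return pooled


def pyramid_descriptor(
    fmap: FeatureMap, windows: Sequence[int] = (1, 2, 4)
) -> List[float]:
    """Concatenate the flattened pooled maps, one per window size.

    Larger windows summarize wider context from the same layer; this is an
    assumed stand-in for the paper's descriptor, not its actual definition.
    """
    descriptor: List[float] = []
    for window in windows:
        for row in avg_pool(fmap, window):
            descriptor.extend(row)
    return descriptor


if __name__ == "__main__":
    # Toy 4 x 4 map with values 0..15.
    toy_map = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
    print(len(pyramid_descriptor(toy_map)))  # 16 + 4 + 1 entries
```

In this toy setting, the fine-scale entries keep local detail while the coarse-scale entries summarize wider context from the same layer, which is the property the abstract attributes to the descriptor.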