Revisiting Modality-Specific Feature Compensation for Visible-Infrared Person Re-Identification

2022 
Although modality-specific feature compensation has become a prevailing paradigm for feature learning in Visible-Infrared Person Re-Identification (VI-ReID), its performance is not promising, especially when compared to modality-shared feature learning. In this paper, by revisiting modality-specific feature compensation based models, we reveal the reasons for their underperformance: (1) images generated from one modality to another may be poor in quality; (2) existing models usually achieve modality-specific feature compensation only via simple pixel-level fusion strategies; (3) generated images cannot fully replace the corresponding missing ones, which introduces extra modality discrepancy. To address these issues, we propose a new Two-Stage Modality Enhancement Network (TSME) for VI-ReID. Concretely, it first accounts for the modality discrepancy in cross-modality style translation and optimizes the structure of the image generators by introducing a new Deeper Skip-connection Generative Adversarial Network (DSGAN) to generate high-quality images. Then, it presents an attention-based feature-level fusion module, i.e., the Pair-wise Image Fusion (PwIF) module, and an auxiliary learning module, i.e., the Invoking All-Images (IAI) module, to better exploit the generated and original images for reducing modality discrepancy from the perspectives of feature fusion and feature constraints, respectively. Comprehensive experiments demonstrate the success of TSME in tackling the modality discrepancy issue in VI-ReID.
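To make the idea of attention-based feature-level fusion concrete, below is a minimal PyTorch sketch of a pair-wise fusion module in the spirit of PwIF. The abstract does not specify the module's internals, so the class name `PairwiseImageFusion`, the channel-attention design, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class PairwiseImageFusion(nn.Module):
    """Hypothetical attention-based fusion of an original image's features
    with the features of its generated cross-modality counterpart."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Bottleneck MLP that predicts per-channel fusion weights
        # from the concatenated (original, generated) descriptors.
        self.attn = nn.Sequential(
            nn.Linear(2 * channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, feat_orig: torch.Tensor, feat_gen: torch.Tensor) -> torch.Tensor:
        # feat_orig, feat_gen: (B, C, H, W) backbone features of the original
        # image and its generated cross-modality image.
        b, c, _, _ = feat_orig.shape
        pooled = self.pool(torch.cat([feat_orig, feat_gen], dim=1)).view(b, 2 * c)
        w = self.attn(pooled).view(b, c, 1, 1)       # per-channel attention weights
        return w * feat_orig + (1.0 - w) * feat_gen  # weighted feature-level fusion


# Usage sketch: fuse features of a visible image and its generated infrared image.
fusion = PairwiseImageFusion(channels=2048)
fused = fusion(torch.randn(4, 2048, 18, 9), torch.randn(4, 2048, 18, 9))
```

The design choice here follows the abstract's contrast with pixel-level fusion: the two images are fused in feature space, with learned attention deciding how much each modality contributes per channel.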