Spatial-temporal Fusion Network with Residual Learning and Attention Mechanism: A Benchmark for Video-Based Group Re-ID

2019 
Video-based group re-identification (Re-ID) is a meaningful yet rarely studied task. Group Re-ID captures the relationships between pedestrians, while video sequences provide more frames with which to identify each person. In this paper, we propose a spatial-temporal fusion network for group Re-ID. The network combines residual learning, applied between the CNN and the RNN in a unified network, with an attention mechanism that focuses the model on discriminative features. We also propose a new group Re-ID dataset, DukeGroupVid, to evaluate the performance of our spatial-temporal fusion network. Comprehensive experimental results on the proposed dataset and on other video-based datasets, PRID-2011, i-LIDS-VID and MARS, demonstrate the effectiveness of our model.
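The abstract gives no implementation details, so the following is only a minimal PyTorch sketch of one plausible reading of the described architecture: per-frame CNN features, an RNN over time with a residual connection from the CNN features to the RNN output, and attention-weighted temporal pooling into a clip-level embedding. The backbone, dimensions, and class name `SpatialTemporalFusion` are all assumptions, not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialTemporalFusion(nn.Module):
    """Hypothetical sketch: CNN frame features + RNN temporal modeling,
    fused via a residual connection and attention pooling."""

    def __init__(self, feat_dim=128):
        super().__init__()
        # Small stand-in backbone; the paper does not specify the CNN used.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(64, feat_dim)
        self.rnn = nn.GRU(feat_dim, feat_dim, batch_first=True)
        # Scores each time step for attention-weighted temporal pooling.
        self.attn = nn.Linear(feat_dim, 1)

    def forward(self, clip):
        # clip: (batch, time, 3, H, W)
        b, t = clip.shape[:2]
        x = self.cnn(clip.flatten(0, 1)).flatten(1)  # (b*t, 64)
        f = self.proj(x).view(b, t, -1)              # per-frame CNN features
        h, _ = self.rnn(f)                           # temporal modeling
        h = h + f                                    # residual: RNN output + CNN features
        w = F.softmax(self.attn(h), dim=1)           # temporal attention weights
        return (w * h).sum(dim=1)                    # attended clip embedding


# Usage: embed two 8-frame clips of 64x64 RGB frames.
model = SpatialTemporalFusion()
emb = model(torch.randn(2, 8, 3, 64, 64))  # -> (2, 128)
```

The residual connection here lets the attention layer weigh features that retain both the raw spatial (CNN) and temporal (RNN) information, which is one common way to fuse the two streams in a unified network.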