Online Multi-Object Tracking with Pose-Guided Object Location and Dual Self-Attention Network

2021 
The recent trend in Multi-Object Tracking (MOT) is heading towards using deep learning to detect objects and extract features. Although tracking frameworks using detection network have achieved outstanding performance in object locating on MOT, it is still challenging for crowded occlusion. In this paper, we propose to alleviate this difficulty by combining bounding boxes from outputs of both object detection and pose estimation. The motivation behind generating redundant candidates is that object detection and pose estimation can complement each other in tracking scenes. In order to get optimal tracking objects from candidates, we present Soft-Pose-NMS. For similarity calculation, we design a Dual Self-Attention Network (DSAN) with the self-attention mechanism. The network generates the self-attention map that enables the network to focus on the object area of detection and tracklet images. Simultaneously, the network can extract the temporal self-attention feature map to suppress noisy images in the tracklet. Experiments are conducted on the MOT benchmark datasets. Results show that our tracker achieves competitive results and is state-of-the-art in half of the metrics.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    31
    References
    0
    Citations
    NaN
    KQI
    []