Learnable Sparse Transform Siamese Attention Networks for Visual Tracking

2021 
Visual tracking based on Siamese networks formulates the tracking task as a matching problem between a target template and search regions. Most existing algorithms comprise an offline training stage and an online tracking stage, with the corresponding deep features extracted by a Convolutional Neural Network (CNN). In these tracking algorithms, spatial and channel-wise redundancy usually remains in the extracted features, which limits tracker performance. In this paper, we propose a Learnable Sparse Transform Siamese Network tracking algorithm, referred to as SiamLST, which consists of a feature extraction network with shared weights, a learnable sparse transformation module, and a cross-correlation operation for similarity matching. By transforming the feature maps into a sparser domain, we obtain richer feature maps and reduce the redundancy of local features. In addition, SiamLST is better able to represent the target in complex situations, such as image noise or corruption at the inference stage. Extensive experiments on OTB2015 show that the proposed algorithm achieves excellent tracking performance.
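To make the described architecture concrete, the following is a minimal, hypothetical PyTorch sketch (not the authors' implementation): a shared backbone, a learnable sparse transform modeled here as a 1x1 projection followed by learnable soft-thresholding, and a SiamFC-style cross-correlation for similarity matching. The module names, layer sizes, and the specific shrinkage form are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LearnableSparseTransform(nn.Module):
    """Assumed form of the sparse transform: a learnable 1x1 projection
    followed by soft-thresholding (shrinkage), which zeroes out small
    responses and thereby sparsifies the feature maps."""

    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)
        # one learnable threshold per channel
        self.threshold = nn.Parameter(torch.full((1, channels, 1, 1), 0.1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.proj(x)
        # soft shrinkage: sign(z) * max(|z| - t, 0)
        return torch.sign(z) * F.relu(z.abs() - self.threshold.abs())


def cross_correlation(z: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
    """SiamFC-style cross-correlation: the template features z act as a
    convolution kernel slid over the search-region features x."""
    b = z.size(0)
    x = x.reshape(1, b * x.size(1), x.size(2), x.size(3))  # (1, B*C, Hx, Wx)
    out = F.conv2d(x, z, groups=b)                         # (1, B, Ho, Wo)
    return out.reshape(b, 1, out.size(2), out.size(3))     # response maps


class SiamLSTSketch(nn.Module):
    """Toy Siamese tracker: shared backbone -> sparse transform -> xcorr."""

    def __init__(self):
        super().__init__()
        # Placeholder backbone standing in for the paper's CNN.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=7, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(128, 256, kernel_size=3), nn.ReLU(),
        )
        self.sparse = LearnableSparseTransform(256)

    def forward(self, template: torch.Tensor, search: torch.Tensor) -> torch.Tensor:
        z = self.sparse(self.backbone(template))  # template branch (shared weights)
        x = self.sparse(self.backbone(search))    # search branch (shared weights)
        return cross_correlation(z, x)            # similarity / response map


if __name__ == "__main__":
    model = SiamLSTSketch()
    template = torch.randn(2, 3, 127, 127)  # exemplar crops
    search = torch.randn(2, 3, 255, 255)    # search-region crops
    print(model(template, search).shape)    # torch.Size([2, 1, 33, 33])
```

In this sketch, sparsification is applied to both branches after the shared backbone, so the cross-correlation operates on the thresholded (sparser) feature maps; the actual transform and backbone used in the paper may differ.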