Inverted Residual Siamese Visual Tracking With Feature Crossing Network

2021 
Siamese network-based visual tracking has recently drawn great attention due to its superior representation ability and tracking accuracy. However, the backbone and prediction networks still cannot take full advantage of the features from modern deep networks. In this paper, we propose an inverted residual Siamese feature-crossing network (IRSiamese-FCN), which is trained end-to-end off-line on a large number of image pairs. Specifically, the Siamese backbone networks for feature extraction consist of an inverted residual (IR) network and a feature-crossing network (FCN). The designed IR architecture is made lightweight by combining depthwise and pointwise convolutions. Moreover, non-linearities and linearities are processed separately in the deep and narrow layers. The feature-crossing network performs feature-level aggregation, which makes deep and shallow layers complement each other more closely and further improves tracking accuracy. We conduct ablation studies and comparison experiments on five large benchmarks. The results demonstrate that the proposed tracker achieves competitive performance.
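To illustrate why combining depthwise and pointwise convolutions makes a layer lightweight, the following is a minimal sketch that counts weights for a standard convolution versus a depthwise-plus-pointwise pair. The function names, the channel width of 64, and the 3x3 kernel are assumptions chosen for illustration and are not taken from the paper; biases and normalization parameters are ignored.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_pointwise_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer size, for illustration only.
    c_in, c_out, k = 64, 64, 3
    full = standard_conv_params(c_in, c_out, k)
    light = depthwise_pointwise_params(c_in, c_out, k)
    print(f"standard: {full}, depthwise+pointwise: {light}")
    print(f"reduction factor: {full / light:.2f}")
```

For this assumed layer size the factorized form needs 4,672 weights instead of 36,864, roughly an eightfold reduction; the actual savings in IRSiamese-FCN depend on the layer widths used in the paper.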