Unsupervised Deep Hyperspectral Video Target Tracking and High Spectral-Spatial-Temporal Resolution (H³) Benchmark Dataset

2021 
Target tracking has received increased attention in the past few decades. However, most target tracking algorithms are based on RGB video data, and few are based on hyperspectral video data. With the development of the new "snapshot" hyperspectral sensors, hyperspectral videos can now be easily obtained, yet hyperspectral video target tracking datasets are still rare. In this article, a high spectral-spatial-temporal resolution hyperspectral video target tracking algorithm framework (H³Net) based on deep learning is proposed. The proposed framework consists of two main parts: 1) an unsupervised deep learning-based target tracking training framework for hyperspectral video; and 2) a dual-branch network structure based on a Siamese network. Using the dual-branch network, the H³Net framework can utilize both the spatial and spectral information. The combination of deep learning and a discriminative correlation filter (DCF) makes the features extracted by deep learning more suitable for the DCF. Compared with hyperspectral images, hyperspectral video data require more manpower to annotate, so we propose an unsupervised approach to train H³Net without any annotation. To address the lack of hyperspectral video datasets, we built a 25-band hyperspectral video dataset for target tracking (the high spectral-spatial-temporal resolution hyperspectral video dataset: the WHU-Hi-H³ dataset). The experimental results obtained with the WHU-Hi-H³ dataset confirm the potential of unsupervised deep learning in hyperspectral video target tracking.
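For readers unfamiliar with the DCF component, the following is a minimal sketch of a generic multi-channel discriminative correlation filter solved in closed form in the Fourier domain. It is not the H³Net implementation: the function names (`gaussian_label`, `train_dcf`, `locate`), the regularization weight `lam`, and the use of raw 25-channel arrays in place of learned deep features are all assumptions made for illustration.

```python
import numpy as np

def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian peak at index (0, 0) with circular wrap-around."""
    h, w = shape
    ys = np.minimum(np.arange(h), h - np.arange(h))  # circular distance to row 0
    xs = np.minimum(np.arange(w), w - np.arange(w))  # circular distance to column 0
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    return np.exp(-(yy ** 2 + xx ** 2) / (2.0 * sigma ** 2))

def train_dcf(x, y, lam=1e-4):
    """Ridge-regression filter in the Fourier domain.

    x: template features, shape (C, H, W); y: desired response, shape (H, W).
    lam is a hypothetical regularization weight.
    """
    X = np.fft.fft2(x, axes=(-2, -1))
    Y = np.fft.fft2(y)
    energy = np.sum(np.abs(X) ** 2, axis=0) + lam  # summed over channels
    return np.conj(Y)[None] * X / energy

def locate(w_hat, z):
    """Correlate the filter with search features z and return the peak (row, col)."""
    Z = np.fft.fft2(z, axes=(-2, -1))
    response = np.fft.ifft2(np.sum(np.conj(w_hat) * Z, axis=0)).real
    peak = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(v) for v in peak)
```

As a usage example, training the filter on a random 25-channel template and calling `locate` on a copy of that template circularly shifted with `np.roll(x, (3, 5), axis=(1, 2))` places the response peak at the applied shift. In a deep DCF tracker the arrays `x` and `z` would be feature maps produced by the network rather than raw inputs.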