Enhanced Memory Network for Video Segmentation

2019 
This paper proposes an Enhanced Memory Network (EMN) for semi-supervised video object segmentation. Space-Time Memory Networks has proven the effectiveness of the abundant use of guidance information. To further improve the accuracy of unknown and small targets, we propose to perform fined-grained segmentation based on the correlation attention map. We introduce a siamese network to obtain the semantic similarity and relevance between the tracking objects and the whole image. The feature map extracted from the siamese network on the cropped image is multiplied onto the whole feature map as the attention of proposal objects. Also, an ASPP module is employed to increase the semantic receptive filed to further improve the segmentation accuracy on different scale. Based on the multi-object combination and multi-scale ensemble, the proposed algorithm achieves the first place on the YouTube-VOS 2019 Semi-supervised Video Object Segmentation Challenge with a J&F mean score of 81.8%.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    8
    References
    9
    Citations
    NaN
    KQI
    []