Semantic video segmentation with dynamic keyframe selection and distortion-aware feature rectification

2021 
Abstract Per-frame segmentation methods have a high computational cost and therefore cannot meet the fast-inference requirements of semantic video segmentation. To reuse extracted features effectively through feature propagation, we present distortion-aware feature rectification and online keyframe selection for fast and accurate video segmentation. The proposed dynamic keyframe scheduling scheme uses reinforcement learning to select keyframes according to the extent of temporal variation. We employ a policy gradient strategy to learn a policy function that maximizes the expected reward. The policy network has two actions in its action space: key and non-key. State information is derived from the element-wise difference between the current frame and the warped estimate of the current frame obtained by propagating the previous frame. For non-key frames, an adaptive partial feature rectification with distortion-aware corrections is then applied to the warped features. Precise feature propagation is critical for tracking temporal changes in the video sequence, since it strongly affects both the accuracy and the throughput of the whole video analysis framework. Guided by the distortion map, a light-weight feature extractor revises the distorted regions of the feature maps, while correctly propagated features are left unchanged. The deep feature flow approach is adopted for feature propagation. We evaluate our scheme on the Cityscapes and CamVid datasets, with DeepLabv3 as the segmentation network and LiteFlowNet for computing flow fields. Experimental results show that the proposed method significantly outperforms previous state-of-the-art methods in both accuracy and throughput.
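The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the nearest-neighbour warping, the hand-set logistic policy (standing in for the learned policy network), the fixed threshold, and the assumption that feature maps share the frame's spatial size are all hypothetical simplifications chosen only to make the data flow concrete.

```python
import numpy as np

def warp_nearest(prev, flow):
    """Backward-warp `prev` (H, W, C) using a flow field (H, W, 2) of (dx, dy) offsets.

    Nearest-neighbour sampling is an assumption made for brevity.
    """
    h, w = prev.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return prev[src_y, src_x]

def distortion_map(current, warped):
    """Element-wise absolute difference, averaged over channels, shape (H, W)."""
    return np.abs(current.astype(float) - warped.astype(float)).mean(axis=-1)

def key_probability(dist, weight=10.0, bias=-1.0):
    """Toy two-action policy: probability of choosing 'key' from the mean distortion.

    `weight` and `bias` are hypothetical; the paper learns a policy network
    with policy gradients instead of using fixed values.
    """
    return 1.0 / (1.0 + np.exp(-(weight * dist.mean() + bias)))

def rectify(warped_feat, light_feat, dist, threshold=0.1):
    """Overwrite warped features only where distortion exceeds `threshold`.

    Locations with low distortion keep their propagated values unchanged.
    """
    mask = (dist > threshold)[..., None]
    return np.where(mask, light_feat, warped_feat)

# Tiny usage example: one pixel changes between two otherwise identical frames.
frame_prev = np.zeros((4, 4, 3))
frame_cur = np.zeros((4, 4, 3))
frame_cur[0, 0] = 1.0
flow = np.zeros((4, 4, 2))  # zero motion, an assumption for the example

dist = distortion_map(frame_cur, warp_nearest(frame_prev, flow))
is_key = key_probability(dist) > 0.5

# Stand-ins for propagated features and light-weight extractor output.
warped_feat = np.zeros((4, 4, 8))
light_feat = np.ones((4, 4, 8))
fixed_feat = rectify(warped_feat, light_feat, dist)
```

In this sketch a frame judged non-key keeps its propagated features everywhere except at the locations flagged by the distortion map, which is the partial-rectification idea the abstract describes; a frame judged key would instead be passed through the full segmentation network.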