STA3D: spatiotemporally attentive 3D network for video saliency prediction

Wenbin Zou, Zhuo Shengkai, Yi Tang, Shishun Tian, Xia Li, Chen Xu

2021

Abstract: 3D fully convolutional networks (FCNs), which jointly leverage spatial and temporal cues, have achieved great success in video saliency prediction. However, they remain limited in some challenging cases, e.g., fixation shift. To address this issue, we propose a SpatioTemporally Attentive 3D Network (STA3D) that selectively propagates significant temporal features and refines spatial features in a 3D FCN for video saliency prediction. Extensive experiments on three standard datasets demonstrate the superiority of the proposed model over state-of-the-art methods.

Keywords:

Fixation (psychology)
Pattern recognition
Artificial intelligence
Leverage (statistics)
Computer science
