On semantic-instructed attention

2015 
Visual attention influenced by example images and predefined targets has been widely studied in both cognitive science and computer vision. However, semantics, which are tied to high-level human perception, also exert a strong influence on the top-down attention process. Understanding the impact of semantics on visual attention can provide psychological and computational guidance for many real-world applications, e.g., semantic video retrieval. In this paper, we study the mechanisms of attention control and the computational modeling of saliency detection for dynamic scenes under semantic-instructed viewing conditions. We begin by establishing REMoT, to the best of our knowledge the first video eye-tracking dataset collected under semantic instructions. We record subjects' fixation locations as they follow four kinds of instructions with different levels of noise. Analysis of fixation behavior on REMoT shows that semantic-instructed attention can be explained by the long-term and short-term memory of the human visual system. Inspired by this finding, we propose a memory-guided probabilistic model of semantic-instructed top-down attention. The long-term-memory experience of attention distributions over similar scenes is simulated by a linear mapping of global scene features. An HMM-like conditional probabilistic chain models the dynamic fixation patterns across neighboring frames in short-term memory. Finally, a generative saliency model probabilistically combines the top-down module with a bottom-up module for semantic-instructed saliency detection. We compare our model against state-of-the-art models on REMoT and on the widely used RSD dataset. Experimental results show that our model achieves significant improvements not only in predicting visual attention under correct instructions, but also in detecting saliency under free viewing.
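
The abstract outlines three computational pieces: a linear long-term prior from global scene features, an HMM-like short-term chain over neighboring frames, and a probabilistic top-down/bottom-up combination. The minimal NumPy sketch below shows how such a pipeline could fit together; the function names, the first-order recursion, and the mixture weights `alpha` and `w_td` are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

# Hypothetical sketch of the memory-guided pipeline described above.
# Saliency maps are flattened 1-D probability distributions over pixels;
# all weights here are placeholders, not values from the paper.

def long_term_prior(global_feat, W):
    """Long-term memory: a linear mapping from a global scene feature
    vector to a top-down attention map learned from similar scenes."""
    s = np.maximum(W @ global_feat, 0.0)   # linear map, clipped to >= 0
    return s / (s.sum() + 1e-8)            # normalize to a distribution

def short_term_update(prev_map, cur_prior, alpha=0.7):
    """Short-term memory: an HMM-like first-order chain propagating the
    previous frame's fixation distribution into the current frame."""
    post = alpha * prev_map + (1.0 - alpha) * cur_prior
    return post / (post.sum() + 1e-8)

def combined_saliency(top_down, bottom_up, w_td=0.6):
    """Generative combination: a probabilistic mixture of the top-down
    and bottom-up saliency modules."""
    s = w_td * top_down + (1.0 - w_td) * bottom_up
    return s / (s.sum() + 1e-8)

# Usage over a video: per-frame global features and bottom-up maps
# (from any off-the-shelf bottom-up model) drive the recursion.
rng = np.random.default_rng(0)
W = rng.random((64 * 64, 128))             # placeholder mapping weights
prev = np.full(64 * 64, 1.0 / (64 * 64))   # uniform initial fixation map
for _ in range(5):                          # five dummy frames
    feat = rng.random(128)                  # placeholder global feature
    bu = rng.random(64 * 64)                # placeholder bottom-up map
    td = short_term_update(prev, long_term_prior(feat, W))
    prev = combined_saliency(td, bu / bu.sum())
```

The recursion mirrors the described roles of the two memories: the long-term prior anchors attention to scene-typical regions, while the short-term chain smooths fixations across neighboring frames before the bottom-up evidence is mixed in.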