From Coarse to Fine: Hierarchical Structure-Aware Video Summarization

2021 
Hierarchical structure is a common characteristic of certain kinds of videos (e.g., sports videos, game videos): such videos are composed hierarchically of several actions whose labels can be enumerated, and temporal dependencies exist among segments of different scales. Our approach rests on two intuitions. First, actions are the fundamental units by which people understand these videos. Second, summarization is naturally a process of observation and refinement, i.e., observing segments in the video and hierarchically refining the boundaries of an important action according to the video's hierarchical structure. Based on these insights, we generate action proposals to exploit the structure and formulate summarization as a hierarchical refining process. We also train a hierarchical summarization network with deep Q-learning (HQSN) to carry out the refining process and explore temporal dependency. In addition, we collect a new dataset of structured game videos with fine-grained action and importance annotations. The experimental results demonstrate the effectiveness of our framework.
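The coarse-to-fine refinement described above can be illustrated with a minimal sketch. It is not the authors' HQSN: a greedy hill-climbing rule stands in for the learned Q-network, and the per-frame importance scores, the threshold, the step sizes, and all function names are hypothetical choices made only for illustration. At each scale the sketch tries shifting the start or end boundary by the current step, keeps the best improving move, and halves the step once no move helps.

```python
from typing import List, Tuple


def segment_value(scores: List[float], start: int, end: int, threshold: float) -> float:
    """Hypothetical stand-in for a learned value: total importance above a threshold."""
    return sum(s - threshold for s in scores[start:end])


def refine_boundaries(
    scores: List[float],
    start: int,
    end: int,
    threshold: float = 0.5,
    coarse_step: int = 4,
) -> Tuple[int, int]:
    """Refine the half-open segment [start, end) from coarse to fine steps.

    Assumption: greedy moves replace the Q-learning policy of the paper.
    """
    n = len(scores)
    step = coarse_step
    while step >= 1:
        improved = True
        while improved:
            improved = False
            best = segment_value(scores, start, end, threshold)
            # Candidate actions: move either boundary left or right by `step`.
            candidates = [
                (start - step, end),
                (start + step, end),
                (start, end - step),
                (start, end + step),
            ]
            for s, e in candidates:
                if 0 <= s < e <= n:
                    value = segment_value(scores, s, e, threshold)
                    if value > best:
                        best, start, end, improved = value, s, e, True
        step //= 2  # move to the next, finer scale
    return start, end


# Toy video of 16 frames in which frames 6..10 are important.
toy_scores = [0.0] * 6 + [1.0] * 5 + [0.0] * 5
refined = refine_boundaries(toy_scores, 0, 16)
```

On this toy input the segment starts as the whole video and is narrowed step by step to the important frames. A learned policy would replace `segment_value` and the greedy choice with Q-values estimated from video features.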