A comprehensive solution for detecting events in complex surveillance videos

2019 
Event detection has long been a fundamental problem in the computer vision community. Various datasets for recognizing human events and activities, such as UCF101 and HMDB51, have been proposed to help develop better models and methods. These datasets share a common property: either predefined scripts are provided, or the images are mostly actor-oriented with little background noise. Surveillance footage has neither property, so solutions that are effective on these datasets are not suitable for surveillance event detection. Event detection in complex surveillance video is a much more difficult task, with several challenges: heavy occlusion between pedestrians, low image resolution, and uncontrolled scene conditions. The TRECVID-SED evaluation, which aims at detecting events in a highly crowded airport, is well known for its difficulty. To deal with event detection in realistic scenes such as TRECVID-SED, we introduce a comprehensive solution framework based on pedestrian detection, deep key-pose detection, and trajectory analysis. Specifically, instead of detecting the whole body of a person, we detect the head-shoulder region of each pedestrian, which addresses the heavy occlusion of pedestrians in complex scenes. We also propose a trajectory-based event detection method to better focus on the key actors of events. For events with discriminative poses, we model event detection as key-pose detection, taking advantage of Faster R-CNN. The presented framework achieves the best result in the TRECVID-SED 2016 evaluation.