Skeleton-based action recognition with hierarchical spatial reasoning and temporal stack learning network

2020 
Abstract Skeleton-based action recognition aims to recognize human actions by exploiting the inherent characteristics of given skeleton sequences, and it has attracted increasing attention because of its great potential in practical applications. Previous methods have shown that learning discriminative spatial and temporal features from skeleton sequences is crucial for recognizing human actions. Nevertheless, modeling spatio-temporal evolution remains a challenging problem. In this work, we propose a novel model, the hierarchical spatial reasoning and temporal stack learning network (HSR-TSL), to learn discriminative spatial and temporal features for human action recognition. It consists of a hierarchical spatial reasoning network (HSRN) and a temporal stack learning network (TSLN). Specifically, the HSRN employs a hierarchical residual graph neural network to capture two levels of spatial features: intra-part spatial information within each body part and body-level structural information between parts. The TSLN models the detailed temporal dynamics of skeleton sequences with a composition of multiple skip-clip LSTMs. During training, we develop a clip-based incremental loss to optimize the model effectively. We perform extensive experiments on five challenging benchmarks to verify the effectiveness of each component of our model. The comparison results show that our approach significantly improves performance on skeleton-based action recognition.