End-to-end learning of deep convolutional neural network for 3D human action recognition

2017 
Skeleton-based human action recognition has recently received significant attention from various research communities because the skeleton representation is robust, succinct, and view-invariant. Most existing skeleton-based methods recognize human actions using either well-designed classifiers with hand-crafted features or recurrent neural networks (RNNs). In this paper, inspired by the breakthroughs of deep convolutional neural networks in the image domain, we transform a skeleton sequence into an image and perform end-to-end learning of a deep convolutional neural network (CNN). The skeleton-sequence-based image contains both spatial and temporal information. Our proposed method is tested on the NTU RGB+D dataset, which is so far the largest skeleton-based human action dataset, and achieves state-of-the-art performance on both the cross-view and cross-subject evaluations.
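The abstract does not spell out how a skeleton sequence becomes an image, so the following is only a minimal sketch of one plausible encoding, not the authors' exact procedure. It assumes the sequence is a NumPy array of shape (frames, joints, 3), places joints along image rows and frames along image columns, and maps the x, y, z coordinates to three color channels with per-channel min-max scaling. The function name `skeleton_to_image` and the layout are hypothetical choices made for illustration.

```python
import numpy as np


def skeleton_to_image(seq: np.ndarray) -> np.ndarray:
    """Encode a skeleton sequence as an 8-bit, 3-channel image.

    Assumed input: float array of shape (frames, joints, 3) holding
    x, y, z joint coordinates. Output: uint8 array of shape
    (joints, frames, 3), so rows index joints (spatial axis) and
    columns index frames (temporal axis).
    """
    if seq.ndim != 3 or seq.shape[2] != 3:
        raise ValueError("expected shape (frames, joints, 3)")

    # Per-channel minimum and maximum over all frames and joints.
    lo = seq.min(axis=(0, 1), keepdims=True)
    hi = seq.max(axis=(0, 1), keepdims=True)

    # Guard against a constant channel to avoid division by zero.
    span = np.where(hi > lo, hi - lo, 1.0)

    # Scale each channel to [0, 255] and round to the nearest integer.
    scaled = (seq - lo) / span * 255.0
    image = np.rint(scaled).astype(np.uint8)

    # Reorder (frames, joints, 3) -> (joints, frames, 3).
    return image.transpose(1, 0, 2)


# Usage with synthetic data: 40 frames of a 25-joint skeleton.
rng = np.random.default_rng(0)
sequence = rng.normal(size=(40, 25, 3))
image = skeleton_to_image(sequence)
```

The resulting array could then be resized to a fixed resolution and passed to any image CNN. The 25-joint example matches the Kinect v2 skeletons used in NTU RGB+D, but the sequence length and the normalization scheme here are assumptions.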