Vision-based Human Action Recognition on Pre-trained AlexNet

2019 
Deep learning has been applied extensively to object and pattern recognition because of its strength in feature extraction and classification. However, its superior performance can only be guaranteed when large amounts of training data and high-specification processing hardware are available. An alternative is transfer learning, in which a neural network is first trained on data similar to the target data, so that knowledge such as learned features and weights can be leveraged when training the new model. In this project, vision-based human action recognition is conducted via transfer learning. Specifically, in the proposed approach, the earlier layers of a pre-trained AlexNet are preserved, since the low-level features they extract are generic and common to most data, while the network is fine-tuned on the data of interest, namely human action data. Since AlexNet requires input of size 227×227×3, the frames of each video are processed into three templates: (1) the Motion History Image, carrying spatio-temporal information; (2) the binary Motion Energy Image, encoding the motion region; and (3) an optical flow template, holding accumulated motion speed information. The proposed approach is validated on two publicly available databases, the Weizmann database and the KTH database, and from the empirical results a promising performance of about 90% accuracy is obtained on both.
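As a rough illustration of how the three input templates might be derived from a video, the Python sketch below uses OpenCV frame differencing for the Motion History and Motion Energy Images and Farneback dense optical flow for the speed template. The decay constant, thresholds, and function names are illustrative assumptions, not the paper's exact formulation.

import cv2
import numpy as np

def compute_templates(frames, tau=20, motion_thresh=30):
    """Build MHI, MEI, and an accumulated optical-flow template
    from a list of grayscale frames (assumed parameters)."""
    h, w = frames[0].shape
    mhi = np.zeros((h, w), dtype=np.float32)
    mei = np.zeros((h, w), dtype=np.uint8)
    flow_acc = np.zeros((h, w), dtype=np.float32)

    for prev, curr in zip(frames[:-1], frames[1:]):
        # Binary motion mask from simple frame differencing.
        diff = cv2.absdiff(curr, prev)
        _, mask = cv2.threshold(diff, motion_thresh, 1, cv2.THRESH_BINARY)

        # Motion History Image: recent motion is bright, older motion decays.
        mhi = np.where(mask == 1, float(tau), np.maximum(mhi - 1.0, 0.0))

        # Motion Energy Image: union of all motion regions so far.
        mei = np.maximum(mei, mask)

        # Dense optical flow (Farneback); accumulate motion magnitude.
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        flow_acc += np.linalg.norm(flow, axis=2)

    # Normalise each template to 8-bit and resize to AlexNet's 227x227 input.
    def to_input(t):
        t = cv2.normalize(t.astype(np.float32), None, 0, 255, cv2.NORM_MINMAX)
        return cv2.resize(t.astype(np.uint8), (227, 227))

    return to_input(mhi), to_input(mei * 255), to_input(flow_acc)

One plausible reading of the abstract is that the three 227×227 templates are stacked channel-wise to form the 227×227×3 input that AlexNet expects.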
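The fine-tuning step itself can be sketched as follows, assuming a PyTorch/torchvision setup (the paper does not name a framework): the convolutional layers of a pre-trained AlexNet are frozen to preserve the generic low-level features, and the final classifier layer is replaced and trained on the action templates. Class count and hyperparameters are illustrative.

import torch
import torch.nn as nn
from torchvision import models

num_actions = 10  # e.g., the Weizmann database has 10 action classes

# Load AlexNet with ImageNet weights (torchvision >= 0.13 weights API).
model = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)

# Freeze the earlier convolutional layers: their features are generic.
for param in model.features.parameters():
    param.requires_grad = False

# Replace the final classifier layer to output the action classes.
model.classifier[6] = nn.Linear(4096, num_actions)

# Optimise only the parameters that still require gradients.
optimizer = torch.optim.SGD(
    filter(lambda p: p.requires_grad, model.parameters()),
    lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One training step on a batch of 227x227x3 template images (dummy data).
x = torch.randn(8, 3, 227, 227)          # stacked MHI/MEI/flow channels
y = torch.randint(0, num_actions, (8,))  # action labels
optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()
optimizer.step()

Because only the trainable parameters are passed to the optimizer, gradients flow solely through the fully connected classifier, which keeps fine-tuning feasible on modest hardware and small action datasets.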