Training data reduction in deep neural networks with partial mutual information based feature selection and correlation matching based active learning

2017 
In this paper, we develop a novel scheme to reduce the amount of training data required for training deep neural networks (DNNs). We first apply a partial mutual information (PMI) technique to select the optimal DNN input feature set. We then use a correlation matching based active learning (CMAL) technique to select and label the most informative training data. We integrate these two techniques with a DNN classifier consisting of layers of unsupervised sparse autoencoders and a supervised softmax layer. Simulations conducted on the breast cancer data set from the UCI repository show that this scheme drastically reduces the amount of labeled data necessary for DNN training while maintaining superior performance on the reduced training sets.
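A minimal, hypothetical sketch of the pipeline described above, under several stated assumptions: plain mutual information (sklearn's `mutual_info_classif`) stands in for the paper's partial mutual information criterion, the greedy correlation-matching loop is only one possible reading of CMAL rather than the authors' exact algorithm, and an `MLPClassifier` substitutes for the stacked sparse autoencoder plus softmax classifier. The dataset is sklearn's bundled copy of the UCI breast cancer data.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import mutual_info_classif
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)

# Step 1: feature selection -- rank features by mutual information with the
# label and keep the top k (a stand-in for the PMI-based feature search).
k = 10
mi = mutual_info_classif(X, y, random_state=0)
top = np.argsort(mi)[::-1][:k]
Xs = X[:, top]

# Step 2: correlation-matching active learning (one interpretation of CMAL):
# greedily grow the labeled pool so that its feature correlation matrix stays
# close, in Frobenius norm, to that of the full data pool.
def corr(A):
    return np.corrcoef(A, rowvar=False)

target_corr = corr(Xs)
rng = np.random.default_rng(0)
labeled = list(rng.choice(len(Xs), size=20, replace=False))  # small seed set
pool = [i for i in range(len(Xs)) if i not in labeled]
budget = 100  # total number of labels we are willing to acquire

while len(labeled) < budget:
    best_i, best_d = None, np.inf
    for i in pool:
        cand_corr = corr(Xs[labeled + [i]])
        d = np.linalg.norm(cand_corr - target_corr)
        if d < best_d:
            best_i, best_d = i, d
    labeled.append(best_i)
    pool.remove(best_i)

# Step 3: train a small network on the reduced, actively selected subset.
clf = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
clf.fit(Xs[labeled], y[np.array(labeled)])
print("accuracy on full set:", clf.score(Xs, y))
```

With this setup, only `budget` labels are ever requested, while the selected features and the correlation-matched subset aim to keep the classifier's accuracy close to what full-data training would give.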