Factors Affecting Accuracy of Genotype Imputation Using Neural Networks in Deep Learning

2021 
Genotype imputation is an important topic in genomics. Many genome analyses require data without missing values, so missing genotypes must be imputed first. In recent years deep learning has attracted wide attention, and it is well suited to sequence problems such as text, which suggests it may also fit the genotype imputation problem. Based on the recurrent neural network and the convolutional neural network, our study proposes and constructs five model combinations, then imputes and compares the results under different missing-rate scenarios. Starting from the basic models, higher imputation accuracy is obtained by tuning the model hyperparameters. The results indicate that, on all data sets and at every missing rate tested, the CNN1D-RNNM model with tuned hyperparameters obtained the best results. The combination of a one-dimensional convolutional neural network and a recurrent neural network with tuned hyperparameters outperforms a single convolutional network or a single recurrent network at every missing rate tested. This research provides new solutions for genotype imputation by using deep learning to build complex neural networks.
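The comparison "under different missing-rate scenarios" can be illustrated with a minimal sketch of one plausible evaluation loop: hide a fraction of known genotypes, impute them, and score accuracy only on the hidden cells. This is not the paper's code. The 0/1/2 genotype coding, the `MISSING` sentinel, and the helper names `mask_genotypes`, `impute_mode`, and `imputation_accuracy` are all assumptions made for illustration, and a simple per-site most-frequent-genotype imputer stands in for the neural network models.

```python
import random
from collections import Counter

MISSING = -1  # hypothetical sentinel for a missing genotype call


def mask_genotypes(matrix, missing_rate, seed=0):
    """Copy `matrix` (samples x sites) and hide about `missing_rate` of its cells.

    Returns the masked copy and the list of (row, col) positions that were hidden.
    """
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(matrix) for j in range(len(row))]
    n_mask = round(len(cells) * missing_rate)
    masked_positions = rng.sample(cells, n_mask)
    masked = [list(row) for row in matrix]
    for i, j in masked_positions:
        masked[i][j] = MISSING
    return masked, masked_positions


def impute_mode(masked):
    """Baseline stand-in (not the paper's model): fill each missing cell with
    the most common observed genotype at that site; fall back to 0 if none."""
    imputed = [list(row) for row in masked]
    for j in range(len(masked[0])):
        observed = [row[j] for row in masked if row[j] != MISSING]
        fill = Counter(observed).most_common(1)[0][0] if observed else 0
        for row in imputed:
            if row[j] == MISSING:
                row[j] = fill
    return imputed


def imputation_accuracy(truth, imputed, masked_positions):
    """Fraction of hidden cells whose imputed genotype equals the true one."""
    if not masked_positions:
        return float("nan")
    correct = sum(truth[i][j] == imputed[i][j] for i, j in masked_positions)
    return correct / len(masked_positions)


if __name__ == "__main__":
    # Toy data: 20 samples x 8 sites, genotypes coded 0/1/2 (assumed coding).
    data_rng = random.Random(42)
    truth = [[data_rng.choice([0, 1, 2]) for _ in range(8)] for _ in range(20)]
    for rate in (0.1, 0.3, 0.5):
        masked, positions = mask_genotypes(truth, rate, seed=1)
        acc = imputation_accuracy(truth, impute_mode(masked), positions)
        print(f"missing rate {rate:.1f}: accuracy {acc:.3f}")
```

Any imputer that accepts the masked matrix and returns a filled one could replace `impute_mode` in this loop, which is how several model variants could be compared on identical masks.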