Aerolysin Nanopore Identification of Single Nucleotides Using the AdaBoost Model

2019 
Nanopores use the ionic current produced by single-molecule blockages to identify the structure, conformation, chemical groups and charge of a single molecule. Despite tremendous progress in designing sensitive pore-forming materials, analytes that differ by only a single group still exhibit, to some extent, similar residual currents or duration times. The serious overlap in the statistical distributions of residual current and duration time makes it difficult for a nanopore to discriminate individual molecules in a mixture. In this paper, we present an AdaBoost-based machine learning model that identifies multiple analytes differing by a single group within mixed blockages. A set of feature vectors obtained from a Hidden Markov Model (HMM) is used to train the AdaBoost model. Using aerolysin sensing of 5ʹ-AAAA-3ʹ (AA3) and 5ʹ-GAAA-3ʹ (GA3) as the model system, our results show that the AdaBoost model increases the identification accuracy from ~0.293 to above 0.991. Five sets of mixed AA3 and GA3 blockages further validate the model, with average training and validation accuracies of 0.997 and 0.989, respectively. The proposed method improves the capacity of a wild-type biological nanopore to efficiently identify a single-nucleotide difference without protein engineering or optimization of the experimental conditions. The AdaBoost-based machine learning approach could therefore promote practical nanopore applications such as genetic and epigenetic detection.
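To illustrate the general idea of boosting weak classifiers over per-blockage feature vectors, here is a minimal sketch of discrete AdaBoost with threshold stumps, written in plain Python. It is not the authors' implementation: the abstract does not specify the HMM-derived features, the weak learner, or the number of boosting rounds, so the two features (a residual-current ratio and a duration), the synthetic Gaussian data, the `n_rounds` value, and all function names are assumptions made for illustration only.

```python
import math
import random

def stump_predict(row, j, t, polarity):
    """Threshold stump: returns +polarity if feature j exceeds t, else -polarity."""
    return polarity if row[j] > t else -polarity

def train_stump(X, y, w):
    """Exhaustively pick the (feature, threshold, polarity) with lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in ((a + b) / 2 for a, b in zip(values, values[1:])):
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, t, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def adaboost_fit(X, y, n_rounds=20):
    """Discrete AdaBoost; labels must be -1 or +1. Returns a list of weighted stumps."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        err, j, t, polarity = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, t, polarity))
        # Up-weight misclassified events, down-weight correct ones, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, polarity))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * stump_predict(row, j, t, p) for a, j, t, p in ensemble)
    return 1 if score >= 0 else -1

def synthetic_events(n_per_class, seed=0):
    """Hypothetical stand-in data: overlapping (current ratio, duration) clusters.
    These numbers are invented and are NOT measurements from the paper."""
    rng = random.Random(seed)
    X, y = [], []
    for label, (mu_i, mu_d) in ((-1, (0.50, 1.0)), (1, (0.55, 1.6))):
        for _ in range(n_per_class):
            X.append([rng.gauss(mu_i, 0.05), rng.gauss(mu_d, 0.3)])
            y.append(label)
    return X, y

if __name__ == "__main__":
    X, y = synthetic_events(100)
    model = adaboost_fit(X, y, n_rounds=20)
    correct = sum(adaboost_predict(model, row) == yi for row, yi in zip(X, y))
    print("training accuracy on synthetic data:", correct / len(y))
```

Each stump alone separates the overlapping clusters poorly, which mirrors the overlap problem described above; the weighted vote combines several such cuts across both features. In practice a library implementation such as scikit-learn's `AdaBoostClassifier` would likely be used instead of hand-written stumps.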