Promoter Identification in DNA Sequences Using Machine Learning

2020 
The quest to extract vital information about protein-coding genes has always fascinated the scientific community. However, experimental data on the internal mechanisms that control the synthesis of a functional gene product are scarce, because identifying transcription sites is highly complex. To understand transcription, it is essential to identify promoters, since promoter sequences define where transcription of a gene begins. Machine learning has provided substantially more accurate results than conventional methods. This paper classifies short E. coli DNA sequences as promoter or non-promoter using machine learning algorithms such as the AdaBoost classifier and a multilayer perceptron neural network, with higher accuracy than existing methodologies. It also compares the accuracy of algorithms not previously used for promoter identification, such as the support vector classifier (with 'RBF' and 'sigmoid' kernels) and the Gaussian process classifier.
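As a rough illustration of the classification setup described in the abstract, the sketch below turns a short DNA string into a numeric feature vector and instantiates the named algorithms. The one-hot encoding, the helper names `one_hot` and `build_classifiers`, the use of scikit-learn, and all hyperparameters are assumptions made for illustration; the abstract does not state how the sequences were encoded or how the models were configured.

```python
from typing import Dict, List

# Assumed encoding (not stated in the abstract): each base becomes a
# 4-element indicator vector in the order a, c, g, t.
BASES = "acgt"


def one_hot(sequence: str) -> List[int]:
    """Flatten a DNA string into a 0/1 feature vector, four slots per base."""
    features: List[int] = []
    for base in sequence.strip().lower():
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
        features.extend(1 if base == b else 0 for b in BASES)
    return features


def build_classifiers() -> Dict[str, object]:
    """Instantiate the algorithms named in the abstract (hypothetical settings)."""
    # scikit-learn is imported lazily so the encoder above works without it.
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.gaussian_process import GaussianProcessClassifier
    from sklearn.neural_network import MLPClassifier
    from sklearn.svm import SVC

    return {
        "AdaBoost": AdaBoostClassifier(),
        "MLP": MLPClassifier(max_iter=1000),
        "SVC (RBF)": SVC(kernel="rbf"),
        "SVC (sigmoid)": SVC(kernel="sigmoid"),
        "Gaussian Process": GaussianProcessClassifier(),
    }
```

Given a list of equal-length sequences and their promoter or non-promoter labels, each sequence could be encoded with `one_hot` and each model from `build_classifiers()` scored with cross-validation to compare accuracies.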