Identifying Enhancers and Their Strength by the Integration of Word Embedding and Convolution Neural Network

2020 
An enhancer is a short regulatory element that plays a major role in up-regulating eukaryotic gene expression. Identifying enhancers experimentally is slow and costly, so an accurate computational tool is much needed in this area. Existing techniques rely on handcrafted features followed by machine learning, whereas the proposed model extracts enhancer features directly from raw DNA sequences by integrating a natural language processing (NLP) technique, word2vec, with a convolutional neural network (CNN). The resulting tool, iEnhancer-CNN, predicts both enhancers and their strength. The evaluation results show that iEnhancer-CNN outperforms the existing state-of-the-art models. Specifically, iEnhancer-CNN improves the accuracy of enhancer identification by 2.6% and of enhancer strength identification by 11.4% relative to those models. A web server for iEnhancer-CNN is freely available at https://home.jbnu.ac.kr/NSCL/iEnhancer-CNN.htm.
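To make the "DNA as text" idea concrete, the following minimal sketch shows one common way a raw sequence can be turned into word-like tokens before a word2vec model and a CNN are applied. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the tokenization, so the overlapping k-mer scheme, the value k = 3, and the helper names `kmer_tokenize`, `build_vocab`, and `encode` are all hypothetical.

```python
from typing import Dict, Iterable, List


def kmer_tokenize(sequence: str, k: int = 3) -> List[str]:
    """Split a DNA sequence into overlapping k-mers (stride 1).

    Assumption: k-mers play the role of "words"; the abstract does not
    state which tokenization iEnhancer-CNN uses.
    """
    seq = sequence.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocab(token_lists: Iterable[List[str]]) -> Dict[str, int]:
    """Assign an integer index to each distinct k-mer, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for tokens in token_lists:
        for token in tokens:
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def encode(tokens: List[str], vocab: Dict[str, int]) -> List[int]:
    """Map tokens to indices; these would index rows of an embedding matrix."""
    return [vocab[token] for token in tokens]


# Hypothetical toy sequence, used only to show the shape of the data.
words = kmer_tokenize("ACGTAC", k=3)
vocab = build_vocab([words])
indices = encode(words, vocab)
```

In a full pipeline, the token lists would be used to train a word2vec model, and each sequence would then be represented as a matrix of k-mer embeddings that a convolutional network can scan for local patterns.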