Spark Based Parallel Deep Neural Network Model for Classification of Large Scale RNAs into piRNAs and Non-piRNAs

2020 
With recent advances in computational biology, high-throughput next-generation sequencing has become a de facto standard technology for gene expression studies involving DNAs, RNAs and proteins. As a promising technology, it has had a significant impact on medical sciences and genomic research. However, it generates several million short DNA and RNA sequences, amounting to several petabytes, in a single run. In addition, raw sequencing datasets such as RNAs are growing exponentially, creating a big data analytics problem in computational biology. Because of this explosive growth in RNA sequences, the timely classification of RNA sequences into piRNAs and non-piRNAs has become challenging for traditional technology and conventional machine learning algorithms. Parallel and distributed computing models, together with deep neural networks, have become a major computing platform for the big data analytics now required in computational biology. This paper presents a computational model based on a parallel deep neural network for the timely classification of large numbers of RNA sequences into piRNAs and non-piRNAs, taking advantage of a parallel and distributed computing platform. The performance of the proposed model was evaluated extensively using two categories of metrics. First, the model was assessed using accuracy-based metrics: accuracy, specificity, sensitivity and the Matthews correlation coefficient. Second, computation-based metrics were observed: computation time, speedup and scalability. The model was assessed first on a real benchmark dataset and subsequently on a replicated benchmark dataset. In both cases, the evaluation results showed that the proposed model improved computation speedup by an order of magnitude compared with the sequential approach, without affecting the accuracy level.
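The two groups of metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions computed from a binary confusion matrix (piRNA as the positive class) and speedup defined as sequential time divided by parallel time; the function names and the example counts are hypothetical and are not taken from the paper.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Accuracy-based metrics from confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


def speedup(sequential_seconds, parallel_seconds):
    """Speedup of a parallel run relative to a sequential run."""
    return sequential_seconds / parallel_seconds


# Hypothetical counts and timings, for illustration only.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
ratio = speedup(sequential_seconds=120.0, parallel_seconds=10.0)
```

Under these definitions, a parallel run that leaves the confusion matrix unchanged but shortens the run time raises the speedup while all four accuracy-based metrics stay the same, which is the kind of outcome the abstract reports.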