DNA Sequencing Technologies: Sequencing Data Protocols and Bioinformatics Tools

2019 
The recent advances in DNA sequencing technology, from first-generation sequencing (FGS) to third-generation sequencing (TGS), have constantly transformed the genome research landscape. Its data throughput is unprecedented and severalfold as compared with past technologies. DNA sequencing technologies generate sequencing data that are big, sparse, and heterogeneous. This results in the rapid development of various data protocols and bioinformatics tools for handling sequencing data. In this review, a historical snapshot of DNA sequencing is taken with an emphasis on data manipulation and tools. The technological history of DNA sequencing is described and reviewed in thorough detail. To manipulate the sequencing data generated, different data protocols are introduced and reviewed. In particular, data compression methods are highlighted and discussed to provide readers a practical perspective in the real-world setting. A large variety of bioinformatics tools are also reviewed to help readers extract the most from their sequencing data in different aspects, such as sequencing quality control, genomic visualization, single-nucleotide variant calling, INDEL calling, structural variation calling, and integrative analysis. Toward the end of the article, we critically discuss the existing DNA sequencing technologies for their pitfalls and potential solutions.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    154
    References
    36
    Citations
    NaN
    KQI
    []