Fast and accurate assembly of Nanopore reads via progressive error correction and adaptive read selection

2020 
Although long Nanopore reads are advantageous in de novo genome assembly, applying Nanopore reads in genomic studies is still hindered by their complex errors. Here, we developed NECAT, an error correction and de novo assembly tool designed to overcome complex errors in Nanopore reads. We proposed an adaptive read selection and two-step progressive method to quickly correct Nanopore reads to high accuracy. We introduced a two-stage assembler to utilize the full length of Nanopore reads. NECAT achieves superior performance in both error correction and de novo assembly of Nanopore reads. NECAT requires only 7,225 CPU hours to assemble a 35X coverage human genome and achieves a 2.28-fold improvement in NG50. Furthermore, our assembly of the human WERI cell line showed an NG50 of 29 Mbp. The high-quality assembly of Nanopore reads can significantly reduce false positives in structure variation detection.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    33
    References
    22
    Citations
    NaN
    KQI
    []