Detection of genome-wide low-frequency mutations with Paired-End and Complementary Consensus Sequencing (PECC-Seq) revealed end-repair derived artifacts as residual errors

2019 
To improve the accuracy and cost-efficiency of next-generation sequencing in ultralow-frequency mutation detection, we developed Paired-End and Complementary Consensus Sequencing (PECC-Seq), a PCR-free duplex consensus sequencing approach. PECC-Seq employed shear points as endogenous barcodes to identify consensus sequences from the overlap in shortened paired-end reads derived from complementary DNA strands, which were then used for sequencing error correction. With the high accuracy of PECC-Seq, we identified the characteristic base substitution errors introduced by the end-repair process of mechanical fragmentation-based library preparations, which were prominent in the terminal 6 bp of the library fragments in the 5′-NpCpA-3′ or 5′-NpCpT-3′ trinucleotide context. As demonstrated at the human genome scale (TK6 cells), after removing these potential end-repair artifacts from the terminal 6 bp, PECC-Seq could reduce the sequencing error frequency to the mid-10⁻⁷ range with a relatively low sequencing depth. For T:A base pairs, the background error rate could be suppressed to the mid-10⁻⁸ range. In mutagen-treated TK6 cells, slight increases in mutagen treatment-related mutant frequencies could be detected, indicating the potential of PECC-Seq in detecting genome-wide ultra-rare mutations. In addition, our findings on the patterns of end-repair artifacts may provide new insights into further reducing technical errors, not only for PECC-Seq but also for other next-generation sequencing techniques.
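The terminal-6-bp filtering step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Call` record, the function names, and the choice to measure distance from either fragment end are assumptions made for illustration only. Only the 6 bp window comes from the abstract.

```python
from dataclasses import dataclass

TERMINAL_BP = 6  # terminal window named in the abstract


@dataclass(frozen=True)
class Call:
    """Hypothetical record for one consensus mismatch against the reference."""
    offset: int           # 0-based position of the mismatch within the fragment
    fragment_length: int  # length of the library fragment in bp


def in_terminal_window(call: Call, terminal_bp: int = TERMINAL_BP) -> bool:
    """True if the mismatch lies within `terminal_bp` bases of either fragment end."""
    distance_to_end = min(call.offset, call.fragment_length - 1 - call.offset)
    return distance_to_end < terminal_bp


def drop_terminal_calls(calls, terminal_bp: int = TERMINAL_BP):
    """Keep only mismatches outside the terminal windows (assumed filter)."""
    return [c for c in calls if not in_terminal_window(c, terminal_bp)]


def error_frequency(n_errors: int, n_bases: int) -> float:
    """Errors per base examined; n_bases should count only the retained bases."""
    if n_bases <= 0:
        raise ValueError("n_bases must be positive")
    return n_errors / n_bases
```

Under these assumptions, a mismatch at offset 5 of a 100 bp fragment would be dropped and one at offset 6 would be kept. If terminal bases are excluded from the numerator, they should also be excluded from the denominator passed to `error_frequency`, otherwise the frequency is understated.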