Systematic and comprehensive benchmarking of an exome sequencing based germline copy-number analysis pipeline to detect clinically relevant CNVs

2019 
Abstract

Purpose: Detecting germline copy-number variants (CNVs) from exome sequencing (ES) is not standard practice in clinical settings, largely because of concerns about performance. We comprehensively characterized an ES-based CNV pipeline and developed frameworks for minimizing false positives and assessing reproducibility.

Methods: We used a cohort of 387 individuals with both clinical chromosomal microarray (CMA) and ES data available to estimate initial performance by comparing the CNVs called on the two platforms. We modified the default workflow to reduce the number of false positives, and assessed the reproducibility of the CNVs using an iterative variant-calling process.

Results: The default pipeline was 93% sensitive, with a high false-discovery rate of 44%. The modified workflow had a higher sensitivity of 96% while reducing the total number of CNVs identified and lowering the false-discovery rate to 11.4%. With the modified workflow, we demonstrated a 100% validation rate for the CNVs identified in STRC, a gene that is challenging to ascertain by short-read next-generation sequencing (NGS). The exome-based pipeline was 100% sensitive for clinically relevant, rare variants (including single-exon deletions) and was reproducible.

Conclusion: Our modified workflow and benchmarking data demonstrate that an exome-based CNV detection pipeline can be used reliably to detect clinically relevant CNVs.
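
The two headline metrics, sensitivity and false-discovery rate, can be illustrated with a minimal sketch. It does not reproduce the authors' pipeline: the matching rule (same chromosome, same CNV type, at least 50% reciprocal overlap), the function names, and the toy coordinates are all assumptions made for illustration. CMA calls stand in for the truth set and ES calls are the set being benchmarked.

```python
from typing import List, Tuple

# (chromosome, start, end, type); half-open coordinates. Hypothetical record layout.
CNV = Tuple[str, int, int, str]


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Return the smaller of the two overlap fractions, or 0.0 if the calls cannot match."""
    if a[0] != b[0] or a[3] != b[3]:
        return 0.0
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return 0.0
    return min(shared / (a[2] - a[1]), shared / (b[2] - b[1]))


def benchmark(truth: List[CNV], calls: List[CNV], min_ro: float = 0.5) -> Tuple[float, float]:
    """Sensitivity = truth CNVs recovered / all truth CNVs.
    False-discovery rate = calls with no truth match / all calls.
    The 0.5 threshold is an assumed default, not a value from the paper."""
    detected = sum(
        1 for t in truth if any(reciprocal_overlap(t, c) >= min_ro for c in calls)
    )
    unmatched = sum(
        1 for c in calls if not any(reciprocal_overlap(t, c) >= min_ro for t in truth)
    )
    sensitivity = detected / len(truth) if truth else float("nan")
    fdr = unmatched / len(calls) if calls else float("nan")
    return sensitivity, fdr


# Toy data, invented purely to exercise the functions.
cma_calls = [("chr1", 100, 200, "DEL"), ("chr2", 500, 900, "DUP")]
es_calls = [
    ("chr1", 110, 190, "DEL"),  # matches the first CMA call
    ("chr3", 10, 50, "DEL"),    # no CMA counterpart, counted as a false positive
    ("chr2", 480, 880, "DUP"),  # matches the second CMA call
]
sensitivity, fdr = benchmark(cma_calls, es_calls)
```

In this toy case both CMA calls are recovered and one of the three ES calls is unmatched. In practice, an ES call without a CMA counterpart is not necessarily false, since the array may lack probes in that region, so a real benchmark would need an orthogonal confirmation step before counting it as a false positive.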