SeeCiTe: a method to assess CNV calls from SNP arrays using trio data.

2021 
Motivation: Single nucleotide polymorphism (SNP) genotyping arrays remain an attractive platform for assaying copy number variants (CNVs) in large population-wide cohorts. However, current tools for calling CNVs are still prone to extensive false positive calls when applied to biobank-scale arrays. Moreover, there is a lack of methods that exploit cohorts with trios available (e.g. nuclear families) to assist in quality control and downstream analyses following the calling.

Results: We developed SeeCiTe (Seeing Cnvs in Trios), a novel CNV quality control tool that post-processes output from current CNV calling tools, exploiting child-parent trio data to classify calls into quality categories and to provide a set of visualizations for each putative CNV call in the offspring. We apply it to the Norwegian Mother, Father, and Child Cohort Study (MoBa) and show that SeeCiTe improves specificity and sensitivity compared to common empirical filtering strategies. To our knowledge, it is the first tool that utilizes probe-level CNV data in trios (and singletons) to systematically highlight potential artefacts and visualize signal intensities in a streamlined fashion suitable for biobank-scale studies.

Availability and implementation: The software is implemented in R, with the source code freely available at https://github.com/aksenia/SeeCiTe.

Supplementary information: Supplementary data are available at Bioinformatics online.