CopyMix: Mixture Model Based Single-Cell Clustering and Copy Number Profiling using Variational Inference

2020 
Motivation: Single-cell sequencing technologies are becoming increasingly established, particularly in the study of tumor heterogeneity, i.e., the cell subpopulations that a cancer tumor typically comprises. Investigating tumor heterogeneity is imperative for understanding how tumors evolve, since each cell subpopulation harbors a unique set of genomic features that yields a unique phenotype, an issue that is bound to have clinical relevance. Clustering cells based on copy number data obtained from single-cell DNA sequencing provides an opportunity to assess different tumor cell subpopulations. Accordingly, computational methods have emerged for detecting single-cell copy number variations (copy number profiling) as well as for clustering; however, these two tasks have until now been handled sequentially, with various ad hoc preprocessing steps, and without an automated, generalized, and fully probabilistic framework.

Results: We propose CopyMix, a novel probabilistic mixture-model-based method for single-cell clustering and copy number profiling that uses Variational Inference to simultaneously cluster cells and infer the copy number profiles corresponding to the clusters. CopyMix is evaluated using simulated data as well as published biological data from metastatic colorectal cancer. The results reveal high V-measures for clustering and low errors in copy number inference. These favorable results indicate considerable potential for clinical impact from using CopyMix in studies of cancer tumor heterogeneity.
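For readers unfamiliar with the clustering metric named above: the V-measure is the harmonic mean of homogeneity (each cluster contains cells of a single true class) and completeness (each true class is assigned to a single cluster), both defined through conditional entropies. The following is a minimal sketch of that standard definition, not code from CopyMix; the function names `entropy`, `conditional_entropy`, and `v_measure` are illustrative assumptions, and the label lists in the usage note are made-up toy data rather than results from the paper.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given) in nats."""
    n = len(target)
    joint = Counter(zip(given, target))   # counts of (given, target) pairs
    given_counts = Counter(given)         # marginal counts of `given`
    return -sum(
        (c / n) * log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def v_measure(true_labels, cluster_labels):
    """Harmonic mean of homogeneity and completeness (beta = 1)."""
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label lists must have the same length")
    h_true = entropy(true_labels)
    h_clust = entropy(cluster_labels)
    # By convention, a zero-entropy labeling gives a score of 1 for that term.
    homogeneity = (
        1.0 if h_true == 0
        else 1.0 - conditional_entropy(true_labels, cluster_labels) / h_true
    )
    completeness = (
        1.0 if h_clust == 0
        else 1.0 - conditional_entropy(cluster_labels, true_labels) / h_clust
    )
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)
```

As a usage illustration on toy labels, `v_measure([0, 0, 1, 1], [1, 1, 0, 0])` scores a perfect clustering regardless of how the cluster ids are named, whereas `v_measure([0, 0, 1, 1], [0, 0, 0, 0])` scores a clustering that lumps every cell into one cluster as having no homogeneity.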