PanoView: An iterative clustering for single-cell RNA sequencing data

2019 
Abstract Single-cell RNA-sequencing (scRNA-seq) provides new opportunities to gain a mechanistic understanding of many biological processes. Current approaches for single cell clustering are often sensitive to the input parameters and have difficulty dealing with cell types with different densities. Here, we present Panoramic View (PanoView), an iterative method integrated with a novel density-based clustering, Ordering Local Maximum by Convex hull (OLMC), that uses a heuristic approach to estimate the required parameters based on the input data structures. In each iteration, PanoView will identify the most confident cell clusters and repeat the clustering with the remaining cells in a new PCA space. Without adjusting any parameter in PanoView, we demonstrated that PanoView was able to detect major and rare cell types simultaneously and outperformed other existing methods in both simulated datasets and published single-cell RNA-sequencing datasets. Finally, we conducted scRNA-Seq analysis of embryonic mouse hypothalamus, and PanoView was able to reveal known cell types and several rare cell subpopulations. Author summary One of the important tasks in analyzing single-cell transcriptomics data is to classify cell subpopulations. Most computational methods require users to input parameters and sometimes the proper parameters are not intuitive to users. Hence, a robust but easy-to-use method is of great interest. We proposed PanoView algorithm that utilizes an iterative approach to search cell clusters in an evolving three-dimension PCA space. The goal is to identify the cell cluster with the most confidence in each iteration and repeat the clustering algorithm with the remaining cells in a new PCA space. To cluster cells in a given PCA space, we also developed OLMC clustering to deal with clusters with varying densities. We examined the performance of PanoView in comparison to other existing methods using ten published single-cell datasets and simulated datasets as the ground truth. The results showed that PanoView is an easy-to-use and reliable tool and can be applied to diverse types of single-cell RNA-sequencing datasets.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    0
    References
    0
    Citations
    NaN
    KQI
    []