Hierarchicell: an R-package for estimating power for tests of differential expression with single-cell data.

2021 
BACKGROUND Study design is a critical aspect of any experiment, and sample size calculations for statistical power that are consistent with that study design are central to robust and reproducible results. However, the existing power calculators for tests of differential expression in single-cell RNA-seq data focus on the total number of cells and not the number of independent experimental units, the true unit of interest for power. Thus, current methods grossly overestimate the power. RESULTS Hierarchicell is the first single-cell power calculator to explicitly simulate and account for the hierarchical correlation structure (i.e., within sample correlation) that exists in single-cell RNA-seq data. Hierarchicell, an R-package available on GitHub, estimates the within sample correlation structure from real data to simulate hierarchical single-cell RNA-seq data and estimate power for tests of differential expression. This multi-stage approach models gene dropout rates, intra-individual dispersion, inter-individual variation, variable or fixed number of cells per individual, and the correlation among cells within an individual. Without modeling the within sample correlation structure and without properly accounting for the correlation in downstream analysis, we demonstrate that estimates of power are falsely inflated. Hierarchicell can be used to estimate power for binary and continuous phenotypes based on user-specified number of independent experimental units (e.g., individuals) and cells within the experimental unit. CONCLUSIONS Hierarchicell is a user-friendly R-package that provides accurate estimates of power for testing hypotheses of differential expression in single-cell RNA-seq data. This R-package represents an important addition to single-cell RNA analytic tools and will help researchers design experiments with appropriate and accurate power, increasing discovery and improving robustness and reproducibility.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    19
    References
    2
    Citations
    NaN
    KQI
    []