Benchmarking Association Analyses of Continuous Exposures with RNA-seq in Observational Studies

Tamar Sofer, Nuzulul Kurniansyah, François Aguet, Kristin Ardlie, Peter Durda, Deborah A. Nickerson, Joshua D. Smith, Yongmei Liu, Sina A. Gharib, Susan Redline, Stephen S. Rich, Jerome I. Rotter, Kent D. Taylor

2021

Large RNA-seq datasets from observational studies, covering hundreds to thousands of individuals, are becoming available. Many popular software packages for RNA-seq analysis were designed to study differences in expression signatures in experimental designs with well-defined conditions (exposures). In contrast, observational studies may have varying levels of confounding of transcript-exposure associations; further, exposure measures may range from discrete (exposed, yes/no) to continuous (levels of exposure), and continuous exposures may be non-normally distributed. We compare popular software for gene expression analysis (DESeq2, edgeR and limma) as well as linear regression-based analyses for studying the association of continuous exposures with RNA-seq. We developed a computational pipeline that includes transformation, filtering and generation of an empirical null distribution of association P-values, and we apply the pipeline to compute empirical P-values with multiple testing correction. We employ a resampling approach that allows for assessment of false positive detection across methods, power comparison and the computation of quantile empirical P-values. The results suggest that linear regression methods are substantially faster and control false detections better than the other methods, even when the resampling method is used to compute empirical P-values. We provide the proposed pipeline with fast algorithms in an R package, Olivia, and applied it to study the associations of measures of sleep-disordered breathing with RNA-seq in peripheral blood mononuclear cells in participants from the Multi-Ethnic Study of Atherosclerosis.
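
The idea of an empirical null distribution of association P-values can be illustrated with a minimal sketch. It is written in Python with NumPy for illustration only and is not the Olivia API. It makes several assumptions that the abstract does not state: the exposure is simply permuted across individuals (no covariate adjustment), each transcript is tested by simple linear regression with a normal approximation to the t statistic, and the null P-values are pooled across transcripts and permutations. The function names `regression_pvalues` and `empirical_pvalues` are hypothetical.

```python
import math
import numpy as np


def regression_pvalues(expr, exposure):
    """Two-sided P-values from simple linear regression of each transcript
    on the exposure.

    expr: array of shape (n_samples, n_transcripts), already transformed
          and filtered (no constant columns).
    exposure: array of shape (n_samples,).
    Assumption: a normal approximation to the t statistic is used, which is
    reasonable only for large sample sizes.
    """
    n = expr.shape[0]
    x = exposure - exposure.mean()
    y = expr - expr.mean(axis=0)
    # Pearson correlation between the exposure and every transcript.
    r = (x @ y) / np.sqrt((x @ x) * (y ** 2).sum(axis=0))
    r = np.clip(r, -0.999999, 0.999999)
    # t statistic of the regression slope, expressed through r.
    t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in t])


def empirical_pvalues(expr, exposure, n_perm=100, seed=0):
    """Empirical P-values relative to a pooled permutation null.

    Assumption: the null is built by permuting the exposure across
    individuals and pooling the resulting P-values over all transcripts.
    """
    rng = np.random.default_rng(seed)
    observed = regression_pvalues(expr, exposure)
    null = np.concatenate(
        [regression_pvalues(expr, rng.permutation(exposure))
         for _ in range(n_perm)]
    )
    null.sort()
    # Number of null P-values less than or equal to each observed P-value.
    counts = np.searchsorted(null, observed, side="right")
    # Add-one correction keeps every empirical P-value strictly positive.
    return (counts + 1) / (null.size + 1)


# Simulated example: a skewed (non-normal) exposure, 50 transcripts,
# and a true association for the first transcript only.
rng = np.random.default_rng(1)
n_samples, n_transcripts = 200, 50
exposure = rng.exponential(size=n_samples)
expr = rng.normal(size=(n_samples, n_transcripts))
expr[:, 0] += exposure

emp = empirical_pvalues(expr, exposure)
```

The resulting `emp` vector could then be passed to a standard multiple testing correction. Real analyses would also need to account for covariates and confounders, which this sketch omits.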
