Transcriptome Spectra: Agnostic Expression Variables To Empower Genomic Epidemiology Studies

2020 
Cancers are highly heterogeneous diseases, and large molecular datasets are increasingly part of describing an individual's unique experience. Gene expression is particularly attractive because it captures both genetic and environmental consequences. Our new approach, SPECTRA, provides a framework of agnostic multi-gene linear equations to calculate variables tuned to the needs of genomic epidemiology studies. SPECTRA variables are not supervised to an outcome. They are quantitative, linearly uncorrelated variables that retain integrity to the original data and cumulatively explain the majority of the global population variance. Together these variables represent a deep dive into the transcriptome, including both large and small sources of variance. The latter are often overlooked but hold potential for identifying smaller groups of individuals with large effects, which is important for developing precision strategies. Each SPECTRA variable is a quantitative tissue phenotype that can be treated as a phenotypic outcome, providing new avenues to explore disease risk. As a set, SPECTRA variables are also ideal for modeling alongside other variables as predictors for any clinical outcome of interest. We demonstrate the flexibility of SPECTRA variables for multiple endpoints using RNA sequencing from 767 myeloma patients in the CoMMpass study. Quantitative transcriptome SPECTRA variables enhance the tools researchers have available for incorporating expression in studies to advance precision screening, prevention, intervention, and survival.
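The properties listed in the abstract (unsupervised, multi-gene linear equations, linearly uncorrelated variables, cumulative variance explained) can be illustrated with a small sketch. The abstract does not describe the actual derivation procedure, so the code below assumes a principal-component-style decomposition purely for illustration. The function names, the 50% variance target, and the simulated data are all hypothetical and are not taken from the study.

```python
import numpy as np

def fit_linear_variables(expr, var_target=0.5):
    """Derive gene weights from a samples x genes matrix (illustrative only).

    Assumption: a PCA-style decomposition stands in for the unspecified
    procedure. No outcome is used, so the variables are unsupervised.
    """
    mean = expr.mean(axis=0)
    centered = expr - mean
    # SVD of the centered matrix; rows of vt are per-gene weight vectors.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    # Smallest number of variables whose cumulative variance reaches the target.
    k = int(np.searchsorted(np.cumsum(explained), var_target)) + 1
    k = min(k, len(s))
    weights = vt[:k].T  # genes x k: one linear equation per column
    return mean, weights, explained[:k]

def score(expr, mean, weights):
    """Apply the stored linear equations to get one value per sample per variable."""
    return (expr - mean) @ weights

# Simulated data (hypothetical): 60 samples, 200 genes.
rng = np.random.default_rng(0)
expr = rng.normal(size=(60, 200))

mean, weights, explained = fit_linear_variables(expr, var_target=0.5)
scores = score(expr, mean, weights)

# Off-diagonal correlations between the derived variables are ~0.
corr = np.corrcoef(scores, rowvar=False)
max_offdiag = np.max(np.abs(corr - np.diag(np.diag(corr))))
```

Because the weights are stored separately from the data used to derive them, the same linear equations could in principle be applied to a new sample via `score`, and the resulting columns could be used as either outcomes or predictors in a downstream model.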