Caring without sharing: Meta-analysis 2.0 for massive genome-wide association studies

2018 
Genome-wide association studies (GWAS) have been effective at revealing the genetic architecture of simple traits. Extending this approach to more complex phenotypes has necessitated a massive increase in cohort size. To achieve sufficient power, participants are recruited across multiple collaborating institutions, leaving researchers with two choices: either collect all the raw data at a single institution or rely on meta-analyses to test for association. In this work, we present a third alternative: we implement an entire GWAS workflow (quality control, population structure control, and association) in a fully decentralized setting. Our iterative approach (a) does not rely on consolidating the raw data at a single coordination center, and (b) does not hinge upon large sample size assumptions at each silo. As we show, our approach overcomes challenges faced by meta-analyses when associating rare alleles and when case/control proportions are highly imbalanced at each silo. We demonstrate the feasibility of our method in cohorts ranging in size from 2K (small) to 500K (large), recruited across 2 to 10 collaborating institutions.
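The general idea of testing association without pooling raw data can be illustrated with a minimal sketch. This is not the workflow described in the abstract: it assumes a quantitative phenotype, a single variant, and an ordinary linear model, for which one round of summed aggregates suffices. Iterative schemes for other models are not shown. The function names `local_summaries` and `combine` and the simulated silos are hypothetical.

```python
import numpy as np

def local_summaries(genotype, phenotype):
    """Run inside one silo; only these aggregates leave the site (assumed protocol)."""
    # Design matrix with an intercept column and the genotype dosage.
    X = np.column_stack([np.ones_like(genotype, dtype=float), genotype])
    return X.T @ X, X.T @ phenotype, float(phenotype @ phenotype), len(phenotype)

def combine(summaries):
    """Coordinator: sum the aggregates and solve the normal equations."""
    xtx = sum(s[0] for s in summaries)
    xty = sum(s[1] for s in summaries)
    yty = sum(s[2] for s in summaries)
    n = sum(s[3] for s in summaries)
    beta = np.linalg.solve(xtx, xty)
    rss = yty - beta @ xty                 # residual sum of squares for OLS
    sigma2 = rss / (n - 2)                 # two fitted parameters
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(xtx)))
    return beta, se

# Simulated silos of unequal size with a rare allele (illustrative data only).
rng = np.random.default_rng(0)
silos = []
for n in (300, 500, 1200):
    g = rng.binomial(2, 0.05, size=n).astype(float)
    y = 0.4 * g + rng.normal(size=n)
    silos.append((g, y))

beta_dec, se_dec = combine([local_summaries(g, y) for g, y in silos])

# Reference: the same regression on the pooled raw data.
g_all = np.concatenate([g for g, _ in silos])
y_all = np.concatenate([y for _, y in silos])
X_all = np.column_stack([np.ones_like(g_all), g_all])
beta_pool, *_ = np.linalg.lstsq(X_all, y_all, rcond=None)
```

In this linear case the decentralized estimate matches the pooled one up to floating-point error, and no per-silo large-sample approximation is involved, unlike a meta-analysis that combines separately estimated per-silo effects.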