Proper Conditional Analysis in the Presence of Missing Data Identified Novel Independently Associated Low Frequency Variants in Nicotine Dependence Genes

2017 
Meta-analysis of genetic association studies increases sample size and the power for mapping complex traits. Existing methods are mostly developed for datasets without missing values. In practice, genotype imputation is not always effective, e.g., when targeted genotyping or sequencing assays are used or when the untyped genetic variant is rare. As a result, contributed summary statistics often contain missing values. Naive extensions of existing methods either replace missing summary statistics with 0 or discard studies with missing data. These approaches can bias genetic effect estimates and lead to seriously inflated type I or type II errors in conditional analysis, which is a critical tool for identifying independently associated variants. To address this challenge and complement imputation methods, we developed a method that combines summary statistics across participating studies and consistently estimates joint effects, even when the contributed summary statistics contain a large number of missing values. Based on this estimator, we propose a partial correlation based score statistic (PCBS) for conditional analysis of single-variant and gene-level associations. Through extensive analysis of simulated and real data, we show that the new method produces well-calibrated type I errors and is substantially more powerful than existing approaches. We applied the proposed approach to the CHRNA5-CHRNB4-CHRNA3 locus in a large-scale meta-analysis of cigarettes per day. Using the new method, we identified three novel variants, independent of known association signals, that were missed by alternative methods. Together, these variants explain 0.46% of the phenotypic variance, improving on the variance explained by previously reported associations by 17%. These findings illustrate the extent of allelic heterogeneity at the locus and can help pinpoint causal variants.
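The bias from zero-filling described above can be illustrated with a small simulation. The sketch below is a simplified illustration under stated assumptions, not the PCBS statistic itself: the function names (`simulate_study`, `joint_effects`), the two-variant haplotype model, and all parameter values are hypothetical. It assumes a known variant with a true effect that is missing from half of the studies, and a candidate variant in linkage disequilibrium with it that has no independent effect. It compares zero-filled pooling with pooling in which each entry of the score vector and covariance matrix is averaged over only the samples that contribute to it.

```python
import numpy as np

def simulate_study(rng, n, beta, maf=0.3, ld_copy_prob=0.8):
    """Return score vector U and genotype cross-product V for one study (toy model)."""
    # Two haplotypes per person; variant 2 copies variant 1's allele with
    # probability ld_copy_prob, which induces LD between the two variants.
    a1 = rng.random((n, 2)) < maf
    copy = rng.random((n, 2)) < ld_copy_prob
    a2 = np.where(copy, a1, rng.random((n, 2)) < maf)
    G = np.column_stack([a1.sum(axis=1), a2.sum(axis=1)]).astype(float)
    y = G @ beta + rng.standard_normal(n)
    Gc = G - G.mean(axis=0)           # centered genotypes
    U = Gc.T @ (y - y.mean())         # per-study score statistics
    V = Gc.T @ Gc                     # per-study genotype cross-products
    return U, V

def joint_effects(studies, zero_fill):
    """Pool (U, V, n, observed) tuples and solve for joint effects."""
    m = len(studies[0][0])
    U_sum, V_sum = np.zeros(m), np.zeros((m, m))
    n_U, n_V = np.zeros(m), np.zeros((m, m))
    for U, V, n, obs in studies:
        pair = np.outer(obs, obs)     # True where both variants were observed
        U_sum += np.where(obs, U, 0.0)
        V_sum += np.where(pair, V, 0.0)
        n_U += n * obs
        n_V += n * pair
    if zero_fill:
        # Naive pooling: missing entries simply contribute 0 to the sums.
        return np.linalg.solve(V_sum, U_sum)
    # Average each entry over the samples that actually contributed to it.
    # Assumes every pair of variants is observed together in at least one study.
    return np.linalg.solve(V_sum / n_V, U_sum / n_U)

rng = np.random.default_rng(0)
beta_true = np.array([0.2, 0.0])      # only variant 1 has a true effect
n_per_study = 50_000
studies = []
for k in range(10):
    U, V = simulate_study(rng, n_per_study, beta_true)
    obs = np.array([k < 5, True])     # variant 1 is missing in half of the studies
    studies.append((U, V, n_per_study, obs))

naive = joint_effects(studies, zero_fill=True)
adjusted = joint_effects(studies, zero_fill=False)
print("zero-filled joint effects:   ", naive)
print("per-entry averaged estimates:", adjusted)
```

In this toy setting, the zero-filled estimate assigns a spurious effect to the second variant because the known signal is only partly conditioned out, which is the kind of inflated type I error the abstract describes. The per-entry averaged estimate stays close to the simulated effects. A covariance matrix assembled entry by entry is not guaranteed to be positive definite in general, so a real implementation would need to handle that case.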