Reconstructing a Genotype-Phenotype Map In Chronic Lymphocytic Leukemia

2013 
CLL exhibits great variability in genetic alterations and clinical outcome across patients. Gene expression can serve as an intermediate measurement to provide insight into the cellular circuitry linking genotype to phenotype. To explore this approach, we developed methods to statistically assess the associations between somatic mutations, transcriptional programs, and clinical outcome.

First, in a cohort of 229 CLL patients, we associated IGHV mutation status as well as 7 recurrent (>5% of CLLs) somatic alterations with 44 modules of co-expressed genes that were derived by clustering 3719 variably expressed genes measured by Affymetrix arrays. The somatic mutation genotypes were identified by whole exome sequencing (N=130) and/or SNP6 arrays (N=229). By permutation testing, we determined that 21 modules were associated with IGHV mutation status (P …). … IGHV mutation status, we identified mutated genotypes associated with independent expression modules (trisomy 12 [n=10], ATM/del11q [n=12], SF3B1 [n=6], TP53/del17p [n=5], del13q [n=2], MYD88 alone [n=1], NOTCH1 [n=0]) and some modules associated with more than one genotype (P …).

Second, having identified transcriptional modules associated with distinct genotypes, we sought to understand the functions of these modules and to infer the regulators of these programs. To associate modules with potential functional phenotypes, we performed gene enrichment analysis and found multiple modules associated with inflammatory signatures, DNA repair and MYC targets. We then populated the intermediary layer between genotype and module with candidate transcription factors (TFs) by integrating curated TF datasets with TF expression, motif analysis and module expression. For example, CREB and ATF were candidate TF regulators in modules associated with SF3B1 and ATM mutations, respectively, while MYC- and NFkB-related TFs were candidate regulators of modules associated with trisomy 12 or MYD88 mutations.
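As a rough illustration of the permutation testing used in the first step, the sketch below tests whether a per-sample module score differs between mutated and unmutated samples. The abstract does not state the test statistic or how modules were summarized, so the absolute difference in group means, the function name `permutation_pvalue`, and the synthetic data are all assumptions made for illustration, not the study's method or data.

```python
import numpy as np

def permutation_pvalue(scores, mutated, n_perm=10000, seed=0):
    """Two-sided permutation p-value for a difference in mean module score.

    scores  : per-sample module summary (assumed: e.g. mean expression of module genes)
    mutated : per-sample boolean genotype label
    """
    scores = np.asarray(scores, dtype=float)
    mutated = np.asarray(mutated, dtype=bool)
    rng = np.random.default_rng(seed)

    def stat(labels):
        # Assumed statistic: absolute difference in group means
        return abs(scores[labels].mean() - scores[~labels].mean())

    observed = stat(mutated)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling labels breaks any genotype-expression link
        if stat(rng.permutation(mutated)) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly above zero
    return (exceed + 1) / (n_perm + 1)

# Synthetic illustration only (not data from the study)
rng = np.random.default_rng(1)
mutated = np.array([True] * 20 + [False] * 30)
shifted = rng.normal(0, 1, 50) + 2.0 * mutated  # module score tracks genotype
unrelated = rng.normal(0, 1, 50)                # module score ignores genotype
p_shifted = permutation_pvalue(shifted, mutated, n_perm=2000)
p_unrelated = permutation_pvalue(unrelated, mutated, n_perm=2000)
```

In a setting like the one described, such a test would be repeated for each module and genotype pair, so some form of multiple-testing control would also be needed; the abstract does not say which was used.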
We also identified several candidates of cellular convergence, where multiple genotypes lead to activation of the same transcriptional program. For example, the transcription factor EBF1, an important B cell regulator, was nominated as a candidate regulator in 8 of the 44 modules, which were associated with differing genotypes, suggesting the importance of EBF1 in mediating the genotype-phenotype relationship.

Third, to complete the map from genotype to phenotype, we linked module expression with clinical outcome. Using elastic-net Cox regression, we identified 2 modules associated with longer and 6 with shorter time from sample acquisition to treatment or death. Many of these modules were associated with well-established prognostic indicators (6 with IGHV status, 1 with TP53/del17p, and 1 with SF3B1 status; P … × 10⁻¹¹) and with immune system activation (P = 2 × 10⁻⁶) gene-sets. We used the expression-based Cox-regression index to classify patients into high- and low-risk subgroups (logrank P = 6.1 × 10⁻⁹). Ongoing work seeks to assess the predictive power of our gene-expression signature in relation to traditional prognostic measures, as well as to further annotate the outcome-associated modules.

In summary, this analysis serves as a proof of principle for a 'genotype-phenotype map' for CLL linking somatic alterations, gene expression programs, and clinical outcome. Our inferred CLL networks generate testable hypotheses about how genotypes affect the cellular circuitry of CLL cells; these hypotheses are currently being tested through gain- and loss-of-function experiments.

Disclosures: Brown: Pharmacyclics, Genentech, Celgene, Emergent, Onyx, Sanofi Aventis, Vertex, Avila, Novartis: Consultancy; Genzyme, Celgene: Research Funding.