Identification of Discriminative Gene-level and Protein-level Features Associated with Gain-of-Function and Loss-of-Function Mutations

2021 
Gain-of-function (GOF) and loss-of-function (LOF) mutations in the same gene may result in markedly different clinical phenotypes and hence require different therapeutic treatments. Identifying the functional consequences of mutations is an important step toward understanding disease mechanisms. While there are numerous computational tools (e.g., CADD, SIFT, PolyPhen-2) to predict the pathogenicity of a variant, there are currently no methods to predict whether a given genetic mutation results in a gene product with increased (GOF) or diminished (LOF) activity. Here, we investigated various gene- and protein-level features of GOF and LOF mutations. We generated the first extensive database of all currently known germline GOF and LOF pathogenic mutations by employing natural language processing (NLP) on the available abstracts in the Human Gene Mutation Database. Machine learning and statistical analyses of gene- and protein-level features indicated significant differences between GOF and LOF mutations. For example, GOF mutations were enriched in essential genes, autosomal dominant inheritance, and protein binding and interaction domains, whereas LOF mutations were enriched in singleton genes, protein-truncating variants, and protein core regions. These results are consistent with the notion that mutations underlying recessive and dominant disorders have significantly different structural and functional properties. These findings could ultimately improve our understanding of how mutations affect gene/protein function, thereby guiding future treatment options.