A unified framework for integrative study of heterogeneous gene regulatory mechanisms

2020 
Gene expression is regulated by a large variety of mechanisms. Previous studies that modelled the quantitative relationships between gene expression levels and regulatory mechanisms have considered only one or a few mechanisms at a time, which cannot provide a full picture of the complex interactions among different mechanisms. This is partly due to the heterogeneity of the mechanisms, which involve different types of biological objects and data representations, making it hard to study them in a unified way. Here, we describe a flexible framework that can integrate very different types of data for studying their joint effects on gene expression. In this framework, domain knowledge is represented by metapaths, while the manifestations of their effects in actual data are summarized by an embedding of the biological objects in a latent space. We demonstrate the use of our framework by integrating several diverse types of data that are related to gene expression in different ways, including DNA contacts in three-dimensional genome architecture, protein–protein interactions, genomic neighbourhoods and broad chromatin accessibility domains. The modelling results reveal that each of these data types can model gene expression fairly well individually, and that they model it even better when integrated.

Gene expression is regulated by a variety of mechanisms, which have been difficult to study in a unified way. The authors propose a flexible framework that can integrate different types of data for studying their joint effects on gene expression. The framework uses a general network representation for data integration, metapaths for inputting prior knowledge of gene regulatory mechanisms, and embedding techniques for capturing complex structures in the data.
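To make the role of metapaths more concrete, the following is a minimal sketch of a metapath-constrained random walk on a toy heterogeneous network. It is not the authors' implementation: the node names, the edges, the example metapath and the helper functions `build_adjacency` and `metapath_walk` are all hypothetical, and the abstract does not state which embedding method the framework uses. Walks of this kind are commonly passed to a skip-gram model to obtain node embeddings; that step is omitted here.

```python
import random
from collections import defaultdict

# Toy heterogeneous network. All node names and edges are hypothetical
# illustrations, not data from the paper.
NODE_TYPE = {
    "geneA": "gene", "geneB": "gene", "geneC": "gene",
    "protA": "protein", "protB": "protein", "protC": "protein",
}
EDGES = [
    ("geneA", "protA"), ("geneB", "protB"), ("geneC", "protC"),  # gene encodes protein
    ("protA", "protB"),                                          # protein-protein interaction
]


def build_adjacency(edges):
    """Build an undirected adjacency list from a list of (u, v) pairs."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def metapath_walk(adj, node_type, start, metapath, rng):
    """Walk from `start`, visiting node types in the order given by `metapath`.

    At each step only neighbours of the required type are candidates.
    The walk is truncated if no such neighbour exists.
    """
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the first metapath entry")
    walk = [start]
    for wanted in metapath[1:]:
        candidates = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break  # dead end: no neighbour of the required type
        walk.append(rng.choice(candidates))
    return walk


ADJ = build_adjacency(EDGES)
# Hypothetical metapath: two genes linked through interacting protein products.
METAPATH = ["gene", "protein", "protein", "gene"]

if __name__ == "__main__":
    rng = random.Random(0)
    for gene in ("geneA", "geneB", "geneC"):
        print(gene, "->", metapath_walk(ADJ, NODE_TYPE, gene, METAPATH, rng))
```

In this toy network the metapath encodes one piece of prior knowledge (genes related through interacting proteins), and the walks record where that pattern actually occurs in the data. A gene whose protein has no interaction partner, such as `geneC` here, yields only a truncated walk.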