Mining all publicly available expression data to compute dynamic microbial transcriptional regulatory networks

2021 
Abstract: We are firmly in the era of biological big data. Millions of omics datasets are publicly accessible and can be employed to support scientific research or build a holistic view of an organism. Here, we introduce a workflow that converts all public gene expression data for a microbe into a dynamic representation of the organism’s transcriptional regulatory network. This five-step process walks researchers through the mining, processing, curation, analysis, and characterization of all available expression data, using Bacillus subtilis as an example. The resulting reconstruction of the B. subtilis regulatory network can be leveraged to predict new regulons and analyze datasets in the context of all published data. The results are hosted at https://imodulondb.org/, and additional analyses can be performed using the PyModulon Python package. As the number of publicly available datasets increases, this pipeline will be applicable to a wide range of microbial pathogens and cell factories.