A Data Integration Approach for Detecting Biomarkers of Breast Cancer Survivability.

2020 
We introduce a network-based approach to identify sub-networks of functionally related genes for predicting the 5-year survivability of breast cancer patients treated with chemotherapy, hormone therapy, or a combination of the two. A gene expression dataset and a protein-protein interaction network are integrated to construct a weighted graph, in which each edge weight expresses how well the two corresponding genes jointly predict the class. We propose a scoring criterion that measures the density of a weighted sub-graph and also serves as an estimate of its predictive power. Using this criterion, we identify an optimally dense sub-network for each seed gene and then evaluate that sub-network with a classification method. Finally, among the sub-networks whose classification performance is greater than a given threshold, we search for an optimal set of sub-networks that further improves classification performance via a voting scheme. Our approach significantly improves on the results of existing approaches. For each type of treatment, our best prediction model reaches 85% accuracy or more. Many of the selected sub-networks used to construct the voting models contain genes related to breast or other cancers, including SP1, TP53, MYC, NOG, and many more, providing evidence for downstream analysis.
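The pipeline described above (weighted graph, density score, seed-based sub-network growth, voting) can be illustrated with a minimal sketch. The abstract does not give the actual scoring criterion or search procedure, so everything here is an assumption made for illustration: the density function (total internal edge weight divided by node count), the greedy expansion rule, the plain majority vote, and the function names `density`, `grow_subnetwork`, and `majority_vote`. The edge weights in the usage example are toy values, not results from the paper.

```python
from collections import Counter


def density(nodes, weights):
    """Hypothetical score: total weight of internal edges divided by node count."""
    nodes = set(nodes)
    total = sum(w for (u, v), w in weights.items() if u in nodes and v in nodes)
    return total / len(nodes)


def grow_subnetwork(seed, weights):
    """Greedily add the neighbouring gene that most increases the density score.

    Stops when no neighbour improves the score. This greedy rule is an
    assumption, not the search procedure used in the paper.
    """
    # Build an undirected adjacency map from the weighted edge list.
    adj = {}
    for u, v in weights:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    sub = {seed}
    best = density(sub, weights)
    while True:
        # Genes adjacent to the current sub-network but not yet in it.
        frontier = set().union(*(adj.get(n, set()) for n in sub)) - sub
        candidates = [(density(sub | {c}, weights), c) for c in sorted(frontier)]
        if not candidates:
            break
        score, gene = max(candidates)
        if score <= best:
            break
        sub.add(gene)
        best = score
    return sub, best


def majority_vote(predictions):
    """Combine the class labels predicted by several sub-networks for one patient."""
    return Counter(predictions).most_common(1)[0][0]


# Toy edge weights (illustrative only).
toy_weights = {
    ("TP53", "MYC"): 0.9,
    ("MYC", "SP1"): 0.8,
    ("TP53", "SP1"): 0.7,
    ("SP1", "NOG"): 0.1,
}
subnet, score = grow_subnetwork("TP53", toy_weights)
print(sorted(subnet), round(score, 3))
print(majority_vote([1, 1, 0]))
```

In this toy graph the weakly connected gene is left out because adding it would lower the score. In a fuller version, each grown sub-network would be evaluated by a classifier, and only those above the performance threshold would take part in the vote.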