Disease phenotype synonymous prediction through network representation learning from PubMed database

2020 
Abstract
Synonym mapping between phenotype concepts from different terminologies is difficult because terminology databases have been developed largely independently. Existing maps of synonymous phenotype concepts across terminology databases are highly incomplete, and manual mapping is time-consuming and laborious. An automatic method for predicting synonymous phenotype mappings is therefore of special importance. We propose a classifier-based phenotype mapping prediction model (CPM) to predict synonymous relationships between phenotype concepts from different terminology databases. The model takes network semantic representations of phenotypes as input and predicts synonymous relationships by training binary classifiers and combining their outputs with a voting strategy. We compared the performance of the CPM with a similarity-based phenotype mapping prediction model (SPM), which predicts mappings from the ranked cosine similarity of candidate mapping concepts. Using the N2V-TFIDF network representation with the majority voting (MV) strategy, the CPM achieved an accuracy of 0.943, which was 15.4% higher than that of the SPM using the cosine similarity method (0.789) and 23.8% higher than that of the SSDTM method (0.724) proposed in our previous work.
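
The two prediction styles contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `cosine_rank` and `majority_vote`, the toy vectors, and the 0/1 vote matrix are all hypothetical, and the abstract does not specify which classifiers or features the CPM actually uses. The sketch assumes only that each phenotype is a dense vector, that a similarity-based model ranks candidates by cosine similarity, and that a classifier-based model combines several binary predictions by strict majority.

```python
import numpy as np


def cosine_rank(query, candidates):
    """Rank candidate vectors by descending cosine similarity to the query.

    Returns (order, sims): candidate indices from most to least similar,
    and the similarity of each candidate in its original position.
    """
    query = np.asarray(query, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    sims = c @ q
    return np.argsort(-sims), sims


def majority_vote(votes):
    """Combine binary predictions from several classifiers.

    votes has shape (n_classifiers, n_pairs) with entries 0 or 1.
    A pair is labelled synonymous (1) only if strictly more than half
    of the classifiers predict 1.
    """
    votes = np.asarray(votes)
    return (votes.sum(axis=0) * 2 > votes.shape[0]).astype(int)


# Toy data (hypothetical): one query phenotype and three candidates.
order, sims = cosine_rank([1.0, 0.0], [[0.0, 1.0], [1.0, 0.1], [-1.0, 0.0]])

# Toy data (hypothetical): three classifiers voting on three concept pairs.
labels = majority_vote([[1, 0, 1], [1, 0, 0], [0, 1, 1]])
```

In the similarity-based setting the top-ranked candidate is taken as the predicted mapping, whereas in the voting setting each candidate pair receives an explicit synonymous or non-synonymous label.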