Similarity-based algorithms for Disease Terminology Mapping

2016 
Disease classifications and their related terms are important data resources for basic medical research. However, disease terms in different terminological databases are largely developed independently of each other, and the mapping relationships between them are incomplete. The purpose of this paper is to propose similarity-based disease terminology mapping methods that map disease terms with the same or similar semantic concepts across different terminological databases. By integrating bibliographic medical records from PubMed with manually curated disease-gene and disease-phenotype associations, we propose two methods, namely Semantic-based Similarity for Disease Terminology Mapping (SSDTM) and Information Recommend-based Disease Terminology Mapping (IRDTM), to predict mappings from disease terms in OMIM to MeSH. The experimental results show that both methods can support the prediction of mappings between disease databases. In leave-one-out cross-validation, the prediction performance of SSDTM (Hits@10: 87.3%) is better than that of IRDTM; in manual evaluation, the top-10 hit rate of SSDTM is 94.4%. Supplemented by manual review, the similarity-based disease terminology mapping method can be applied to different mainstream disease terminology databases. It can help improve the efficiency of integrated medical ontology development and of translational bioinformatics work that needs to incorporate multiple data sources from different disciplines.
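To make the Hits@10 figure concrete, the following is a minimal sketch of how a Hits@k score can be computed for a similarity-based mapping. It is not the SSDTM or IRDTM method itself: the cosine similarity over sparse feature vectors, the function names, and the toy OMIM and MeSH identifiers are all assumptions introduced only for illustration.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse vectors (dict: feature -> weight)."""
    dot = sum(w * b.get(f, 0.0) for f, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_candidates(query_vec, target_vecs):
    """Return target term ids sorted by decreasing similarity to the query."""
    scored = [(cosine(query_vec, vec), tid) for tid, vec in target_vecs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken by id
    return [tid for _, tid in scored]

def hits_at_k(queries, target_vecs, gold, k=10):
    """Fraction of query terms whose known mapping appears in the top k."""
    if not queries:
        return 0.0
    hits = sum(
        1 for qid, qvec in queries.items()
        if gold[qid] in rank_candidates(qvec, target_vecs)[:k]
    )
    return hits / len(queries)

# Hypothetical toy data: features stand in for associated genes (g*) and
# phenotypes (p*); the identifiers are placeholders, not real records.
targets = {
    "MESH:A": {"g1": 1.0, "g2": 1.0},
    "MESH:B": {"g3": 1.0},
    "MESH:C": {"g1": 1.0, "p1": 1.0},
}
queries = {
    "OMIM:1": {"g1": 1.0, "g2": 1.0},
    "OMIM:2": {"g3": 1.0, "p1": 1.0},
}
gold = {"OMIM:1": "MESH:A", "OMIM:2": "MESH:C"}

score_at_1 = hits_at_k(queries, targets, gold, k=1)
score_at_2 = hits_at_k(queries, targets, gold, k=2)
```

In this toy setting each query term with a known mapping is ranked against all candidate target terms, which loosely mirrors evaluating one held-out mapping at a time. A top-k list of this kind is also what a manual reviewer would inspect when confirming a proposed mapping.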