Tolerating Data Missing in Breast Cancer Diagnosis from Clinical Ultrasound Reports via Knowledge Graph Inference

2021 
Medical diagnosis through artificial intelligence has drawn increasing attention. For breast lesions, clinical ultrasound reports are the most commonly used data in the diagnosis of breast cancer. Nevertheless, the input reports inevitably suffer from missing data. Although previous approaches have made progress on tackling data imprecision, nearly all of them cannot accept inputs with missing data. A common way to alleviate the issue is to fill the missing values with artificial data. However, this data filling strategy introduces additional noise that does not exist in the raw data. Inspired by the advantage of the open world assumption, we regard the missing data in clinical ultrasound reports as non-observed facts, and propose KGSeD, a knowledge graph embedding based model that tolerates missing data and thereby avoids the pollution caused by data filling. KGSeD is designed as an encoder-decoder framework, where the encoder incorporates structural information of the graph via embedding, and the decoder diagnoses patients by inferring their links to clinical outcomes. Comparative experiments show that KGSeD achieves noticeable diagnostic performance. When data are missing, KGSeD yields the most stable performance among the compared approaches, showing better tolerance to missing data.
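The idea of treating missing report fields as non-observed facts, then diagnosing by link inference, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the attribute names, the relation name `has_outcome`, the helper names, and in particular the TransE-style scoring function, which is swapped in as a familiar stand-in because the abstract does not specify KGSeD's actual encoder or decoder. The embeddings are hand-set rather than trained.

```python
import math


def report_to_triples(patient_id, report):
    """Turn a report into (head, relation, tail) facts.

    Missing fields (None) produce no triple at all, so nothing is imputed.
    Under the open world assumption an absent triple is unknown, not false.
    """
    triples = []
    for attribute, value in report.items():
        if value is None:
            continue  # non-observed fact: skip instead of filling
        triples.append((patient_id, attribute, value))
    return triples


def transe_score(head, relation, tail):
    """TransE-style plausibility (an assumed stand-in, not KGSeD's decoder).

    Returns the negative L2 distance between head + relation and tail,
    so a higher score means a more plausible link.
    """
    return -math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


def diagnose(patient_vec, relation_vec, outcome_vecs):
    """Pick the outcome whose link to the patient scores highest."""
    return max(
        outcome_vecs,
        key=lambda name: transe_score(patient_vec, relation_vec, outcome_vecs[name]),
    )


# Hypothetical report with one missing field.
report = {"shape": "irregular", "margin": None, "echo": "hypoechoic"}
triples = report_to_triples("patient_1", report)

# Hand-set toy embeddings; a real encoder would learn these from the graph.
patient_vec = [0.0, 0.0]
has_outcome_vec = [1.0, 0.0]
outcome_vecs = {"benign": [1.0, 0.0], "malignant": [5.0, 5.0]}
prediction = diagnose(patient_vec, has_outcome_vec, outcome_vecs)
```

In this sketch the missing `margin` field simply contributes no edge to the graph, which is the contrast with data filling: no artificial value enters the training or inference data.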