Federated semi-supervised learning for COVID region segmentation in chest CT using multi-national data from China, Italy, Japan.

2021 
The recent outbreak of Coronavirus Disease 2019 (COVID-19) has led to urgent needs for reliable diagnosis and management of SARS-CoV-2 infection. The current guideline is using RT-PCR for testing. As a complimentary tool with diagnostic imaging, chest Computed Tomography (CT) has been shown to be able to reveal visual patterns characteristic for COVID-19, which has definite value at several stages during the disease course. To facilitate CT analysis, recent efforts have focused on computer-aided characterization and diagnosis with chest CT scan, which has shown promising results. However, domain shift of data across clinical data centers poses a serious challenge when deploying learning-based models. A common way to alleviate this issue is to fine-tune the model locally with the target domains local data and annotations. Unfortunately, the availability and quality of local annotations usually varies due to heterogeneity in equipment and distribution of medical resources across the globe. This impact may be pronounced in the detection of COVID-19, since the relevant patterns vary in size, shape, and texture. In this work, we attempt to find a solution for this challenge via federated and semi-supervised learning. A multi-national database consisting of 1704 scans from three countries is adopted to study the performance gap, when training a model with one dataset and applying it to another. Expert radiologists manually delineated 945 scans for COVID-19 findings. In handling the variability in both the data and annotations, a novel federated semi-supervised learning technique is proposed to fully utilize all available data (with or without annotations). Federated learning avoids the need for sensitive data-sharing, which makes it favorable for institutions and nations with strict regulatory policy on data privacy. Moreover, semi-supervision potentially reduces the annotation burden under a distributed setting. The proposed framework is shown to be effective compared to fully supervised scenarios with conventional data sharing instead of model weight sharing.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    72
    References
    34
    Citations
    NaN
    KQI
    []