Learning From Ambiguous Labels for Lung Nodule Malignancy Prediction

Zehui Liao,Yutong Xie,Shishuai Hu,Yong Xia

Learning From Ambiguous Labels for Lung Nodule Malignancy Prediction

2022

Lung nodule malignancy prediction is an essential step in the early diagnosis of lung cancer. Besides the difficulties commonly discussed, the challenges of this task also come from the ambiguous labels provided by annotators, since deep learning models have in some cases been found to reproduce or amplify human biases. In this paper, we propose a multi-view ‘divide-and-rule’ (MV-DAR) model to learn from both reliable and ambiguous annotations for lung nodule malignancy prediction on chest CT scans. According to the consistency and reliability of their annotations, we divide nodules into three sets: a consistent and reliable set (CR-Set), an inconsistent set (IC-Set), and a low reliable set (LR-Set). The nodule in IC-Set is annotated by multiple radiologists inconsistently, and the nodule in LR-Set is annotated by only one radiologist. Although ambiguous, inconsistent labels tell which label(s) is consistently excluded by all annotators, and the unreliable labels of a cohort of nodules are largely correct from the statistical point of view. Hence, both IC-Set and LR-Set can be used to facilitate the training of MV-DAR. Our MV-DAR contains three DAR models to characterize a lung nodule from three orthographic views and is trained following a two-stage procedure. Each DAR consists of three networks with the same architecture, including a prediction network (Prd-Net), a counterfactual network (CF-Net), and a low reliable network (LR-Net), which are trained on CR-Set, IC-Set, and LR-Set respectively in the pretraining phase. In the fine-tuning phase, the image representation ability learned by CF-Net and LR-Net is transferred to Prd-Net by negative-attention module (NA-Module) and consistent-attention module (CA-Module), aiming to boost the prediction ability of Prd-Net. The MV-DAR model has been evaluated on the LIDC-IDRI dataset and LUNGx dataset. Our results indicate not only the effectiveness of the MV-DAR in learning from ambiguous labels but also its superiority over present noisy label-learning models in lung nodule malignancy prediction.

Correction
Source
Cite
Save
Machine Reading By IdeaReader

References

Citations