MRI‐Based Machine Learning for Differentiating Borderline From Malignant Epithelial Ovarian Tumors: A Multicenter Study

2020 
BACKGROUND: Preoperative differentiation of borderline from malignant epithelial ovarian tumors (BEOT from MEOT) can impact surgical management. MRI has improved this assessment but subjective interpretation by radiologists may lead to inconsistent results. PURPOSE: To develop and validate an objective MRI-based machine-learning (ML) assessment model for differentiating BEOT from MEOT, and compare the performance against radiologists' interpretation. STUDY TYPE: Retrospective study of eight clinical centers. POPULATION: In all, 501 women with histopathologically-confirmed BEOT (n = 165) or MEOT (n = 336) from 2010 to 2018 were enrolled. Three cohorts were constructed: a training cohort (n = 250), an internal validation cohort (n = 92), and an external validation cohort (n = 159). FIELD STRENGTH/SEQUENCE: Preoperative MRI within 2 weeks of surgery. Single- and multiparameter (MP) machine-learning assessment models were built utilizing the following four MRI sequences: T2 -weighted imaging (T2 WI), fat saturation (FS), diffusion-weighted imaging (DWI), apparent diffusion coefficient (ADC), and contrast-enhanced (CE)-T1 WI. ASSESSMENT: Diagnostic performance of the models was assessed for both whole tumor (WT) and solid tumor (ST) components. Assessment of the performance of the model in discriminating BEOT vs. early-stage MEOT was made. Six radiologists of varying experience also interpreted the MR images. STATISTICAL TESTS: Mann-Whitney U-test: significance of the clinical characteristics; chi-square test: difference of label; DeLong test: difference of receiver operating characteristic (ROC). RESULTS: The MP-ST model performed better than the MP-WT model for both the internal validation cohort (area under the curve [AUC] = 0.932 vs. 0.917) and external validation cohort (AUC = 0.902 vs. 0.767). The model showed capability in discriminating BEOT vs. early-stage MEOT, with AUCs of 0.909 and 0.920, respectively. Radiologist performance was considerably poorer than both the internal (mean AUC = 0.792; range, 0.679-0.924) and external (mean AUC = 0.797; range, 0.744-0.867) validation cohorts. DATA CONCLUSION: Performance of the MRI-based ML model was robust and superior to subjective assessment of radiologists. If our approach can be implemented in clinical practice, improved preoperative prediction could potentially lead to preserved ovarian function and fertility for some women. LEVEL OF EVIDENCE: Level 4. TECHNICAL EFFICACY: Stage 2.
    • Correction
    • Source
    • Cite
    • Save
    • Machine Reading By IdeaReader
    33
    References
    13
    Citations
    NaN
    KQI
    []