Metrics to evaluate the performance of auto-segmentation for radiation treatment planning: a critical review.

2021 
Abstract: Advances in artificial intelligence-based methods have led to the development and publication of numerous systems for auto-segmentation in radiotherapy. These systems have the potential to decrease contouring variability, which has been associated with poor clinical outcomes, and to increase efficiency in the treatment planning workflow. However, there are no uniform standards for evaluating auto-segmentation platforms to assess their efficacy at meeting these goals. Here, we review the most frequently used evaluation techniques, which include geometric overlap, dosimetric parameters, time spent contouring, and clinical rating scales. These data suggest that many of the most commonly used geometric indices, such as the Dice Similarity Coefficient, are not well correlated with clinically meaningful endpoints. As such, a multi-domain evaluation, including composite geometric and/or dosimetric metrics with physician-reported assessment, is necessary to gauge the clinical readiness of auto-segmentation for radiation treatment planning.
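For readers unfamiliar with the geometric overlap indices named in the abstract, the following is a minimal sketch of how the Dice Similarity Coefficient is commonly computed, DSC = 2|A ∩ B| / (|A| + |B|), for two binary masks. The function name `dice_coefficient`, the flat 0/1 list representation of the masks, and the convention of returning 1.0 for two empty masks are assumptions made for illustration; they are not taken from the review.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice Similarity Coefficient for two binary masks.

    Masks are flat sequences of 0/1 voxel labels (an illustrative
    representation, not one prescribed by the review).
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")

    size_a = sum(1 for a in mask_a if a)          # |A|
    size_b = sum(1 for b in mask_b if b)          # |B|
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)  # |A ∩ B|

    if size_a + size_b == 0:
        # Assumed convention: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * overlap / (size_a + size_b)


# Hypothetical toy contours: each covers 3 voxels, 2 of them shared.
manual_contour = [0, 1, 1, 1, 0, 0]
auto_contour = [0, 0, 1, 1, 1, 0]
print(dice_coefficient(manual_contour, auto_contour))  # 2*2 / (3+3), about 0.667
```

The score depends only on voxel counts, so it carries no information about where the disagreement lies or how it affects dose. This is consistent with the abstract's point that such indices alone may not track clinically meaningful endpoints.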