Evaluation of a 3D convolutional neural network algorithm for organ segmentation and quantification of tissue metabolism in FDG-PET/CT

2021 
1439 Objectives: 18F-FDG PET is routinely used to assess glucose metabolism in tumours, providing information on the likelihood of malignancy and on response during treatment. FDG uptake in tissues not directly involved in disease can provide additional information on response and side-effects, but its clinical utility is largely unassessed because manually segmenting organs on PET/CT is impractical. As a step towards an automated whole-body metabolic survey, we applied a 3D convolutional neural network (CNN) segmentation algorithm[1] to the low-dose CT component of 50 PET/CT studies, evaluating i) the accuracy of segmentation compared with manual segmentations on PET and CT, and ii) the accuracy of PET SUV measures extracted using regions automatically defined on CT compared with those obtained from manual segmentation on PET.

Methods: Manual segmentations of the liver and spleen were performed on 50 PET/CT scans of patients with Hodgkin lymphoma[2]. The segmentations were created separately on PET (PETman) and CT (CTman). An automated image segmentation tool using a 3D CNN (U-Net)[1] was applied to the CT components of the same studies (CTaut). i) The similarity of CTaut to CTman was evaluated using Dice and Jaccard similarity indices. ii) CTaut segmentations were transferred to PET space, and the quantification of tissue metabolism (SUVmean and SUVpeak) and volume extracted using CTaut was compared with quantification based on PETman.

Results: i) When comparing CTman and CTaut for the liver, an average Dice score of 0.93 ± 0.03 and a Jaccard score of 0.86 ± 0.05 were found. For the spleen, the Dice score was 0.87 ± 0.17 and the Jaccard score 0.79 ± 0.18. ii) Compared to PETman, CTaut and CTman volumes were 21.6% and 16.2% higher respectively in the liver and 12.7% and 23.9% higher in the spleen. Compared to PETman, the average SUVmean of liver and spleen was 1.9% and 4.1% lower respectively when generated by CTaut, versus 2.8% and 4.3% lower when generated by CTman. However, average SUVpeak increased by 62.2% for the liver but only 0.3% for the spleen when using contours generated by CTaut, and by 14.5% and 3.9% respectively when using CTman.

Discussion: Dice and Jaccard scores demonstrated that, when compared on CT, CTman and CTaut had a high degree of similarity, with the liver showing closer agreement than the spleen. Whilst overall the agreement between CTaut transferred to PET space and PETman was high in relation to volume, segmentation beyond the boundaries of the organs led to isolated large volume differences which impacted the average change. SUVpeak was found to be highly sensitive to segmentation method, whereas SUVmean was less sensitive. Inspection of the automated segmentations suggested that this was due to the inclusion of nearby organs, specifically the right kidney and the heart, which created SUV hotspots affecting SUVpeak but less so SUVmean.

Conclusions: The 3D U-Net for automated segmentation performed very well on this independent dataset with regard to volume, similarity and SUVmean. However, the discrepancies observed in SUVpeak extracted from automated rather than manual segmentations suggest that more accurate segmentations may be required for quantification, and emphasize the need to combine both PET and CT information in methods for automated segmentation of PET/CT images.
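The overlap and uptake measures named above can be illustrated with a small sketch. This is not the authors' pipeline: it is a toy example in plain Python, assuming masks are stored as sets of (z, y, x) voxel indices and the PET image as a dictionary of SUV values. All names (`dice`, `jaccard`, `suv_mean`, `suv_peak`, `percent_diff`) are hypothetical, and `suv_peak` uses a simplified definition (highest local mean over a small cubic neighbourhood inside the mask), whereas clinical definitions typically use a sphere of about 1 cm³. The synthetic data is constructed so that the automated mask spills a few voxels into a hot neighbouring structure.

```python
from itertools import product


def dice(a, b):
    """Dice coefficient 2|A n B| / (|A| + |B|) for two sets of voxel indices."""
    if not a and not b:
        return 1.0  # convention chosen here: two empty masks agree
    return 2.0 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard index |A n B| / |A u B| for two sets of voxel indices."""
    union = a | b
    if not union:
        return 1.0  # same convention as in dice()
    return len(a & b) / len(union)


def suv_mean(suv, mask):
    """Mean SUV over all voxels in the mask."""
    return sum(suv[v] for v in mask) / len(mask)


def suv_peak(suv, mask, radius=1):
    """Simplified SUVpeak (assumption): the highest mean SUV over a cubic
    neighbourhood of up to (2*radius+1)**3 voxels, using only voxels that
    lie inside the mask."""
    offsets = list(product(range(-radius, radius + 1), repeat=3))
    best = float("-inf")
    for z, y, x in mask:
        vals = [suv[(z + dz, y + dy, x + dx)]
                for dz, dy, dx in offsets
                if (z + dz, y + dy, x + dx) in mask]
        best = max(best, sum(vals) / len(vals))
    return best


def percent_diff(value, reference):
    """Relative difference of value with respect to reference, in percent."""
    return 100.0 * (value - reference) / reference


# Synthetic data: a 10x10x10 "organ" with uniform SUV 2.0 (manual mask).
organ = set(product(range(10), repeat=3))
# A 2x2x2 block just outside the organ, standing in for a hot neighbour.
spill = set(product(range(2), range(2), range(10, 12)))

suv = {v: 2.0 for v in organ}
suv.update({v: 6.0 for v in spill})

manual_mask = organ
auto_mask = organ | spill  # automated mask that overshoots the boundary

overlap_dice = dice(manual_mask, auto_mask)
overlap_jaccard = jaccard(manual_mask, auto_mask)
mean_change = percent_diff(suv_mean(suv, auto_mask), suv_mean(suv, manual_mask))
peak_change = percent_diff(suv_peak(suv, auto_mask), suv_peak(suv, manual_mask))

print(f"Dice {overlap_dice:.3f}, Jaccard {overlap_jaccard:.3f}")
print(f"SUVmean change {mean_change:.1f}%, SUVpeak change {peak_change:.1f}%")
```

In this toy case the two masks overlap almost perfectly and SUVmean changes by less than 2%, yet the simplified SUVpeak triples, because eight spilled voxels are enough to fill a local neighbourhood with high uptake. In practice the masks would usually be NumPy arrays resampled to a common grid; sets are used here only to keep the sketch dependency-free.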