Data Augmentation Using Synthetic Lesions Improves Machine Learning Detection of Microbleeds from MRI

2018 
Machine learning applied to medical imaging for lesion detection, such as the detection of cerebral microbleeds (CMBs) from Magnetic Resonance Imaging (MRI), is challenged by the relatively small size of the available datasets, which are annotated only through subjective and tedious visual reading, and by the low prevalence of lesions (a few in ~10% of a typical elderly cohort), which results in unbalanced classes. Moreover, the lack of actual ground truth might limit the performance of any machine learning method to that of human readers. Yet the automatic identification of these lesions is relevant for quantifying the cerebrovascular burden associated with dementia, for example when identifying co-morbidity in Alzheimer’s disease. In this paper, we investigated a novel data augmentation strategy: simulating synthetic CMBs on susceptibility-weighted imaging (SWI) MRI scans from healthy individuals to create a large and well-characterized training dataset. First, we characterized actual CMBs from SWI MRI scans and designed a method to create realistic synthetic CMBs whose location, shape, appearance, and size are similar to those of actual CMBs. We then tested a supervised neural network classifier trained on various combinations of actual and synthetic CMBs. Augmenting the data with synthetic CMBs resulted in a large improvement over training on actual CMBs alone when tested on unseen lesions, and provided better results than other standard data augmentation approaches. Our results suggest that data augmentation using synthetic lesions can address the lack of ground truth and the low prevalence of lesions in medical imaging analysis, allowing the deployment of data-hungry supervised learning techniques such as deep learning.
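The general idea of pasting a simulated lesion into a healthy scan can be illustrated with a minimal sketch. This is not the authors' method: the Gaussian lesion shape, the multiplicative darkening, the function name `add_synthetic_lesion`, and the parameters `sigma_vox` and `attenuation` are all assumptions made for illustration, and a random array stands in for a real SWI volume.

```python
import numpy as np


def add_synthetic_lesion(volume, center, sigma_vox, attenuation=0.7):
    """Return a copy of `volume` with a dark Gaussian blob at `center`.

    Hypothetical helper: the Gaussian profile and multiplicative darkening
    are illustrative assumptions, not the lesion model used in the paper.
    """
    # Voxel coordinate grids with the same shape as the volume.
    zz, yy, xx = np.meshgrid(
        *(np.arange(n) for n in volume.shape), indexing="ij"
    )
    sq_dist = (
        (zz - center[0]) ** 2 + (yy - center[1]) ** 2 + (xx - center[2]) ** 2
    )
    # Blob equals 1 at the center and decays smoothly to 0 away from it.
    blob = np.exp(-sq_dist / (2.0 * sigma_vox ** 2))
    # Darken the signal in proportion to the blob (hypointense lesion).
    return volume * (1.0 - attenuation * blob)


# Stand-in for a healthy scan: random intensities, NOT real MRI data.
rng = np.random.default_rng(0)
healthy = rng.uniform(0.8, 1.0, size=(32, 32, 32))

center = (16, 16, 16)
augmented = add_synthetic_lesion(healthy, center, sigma_vox=1.5)

# Because the lesion is simulated, its location is known exactly and can
# serve as a training label without any visual reading.
label = np.zeros(healthy.shape, dtype=bool)
label[center] = True
```

In such a setup the label comes for free with each simulated lesion, which is what makes it possible to generate many balanced training examples from scans of healthy individuals.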