A vision transformer for emphysema classification using CT images

2021 
Objective. Emphysema is characterized by the destruction and permanent enlargement of the alveoli in the lung. Based on its visual appearance on CT, emphysema can be divided into three subtypes: centrilobular emphysema (CLE), panlobular emphysema (PLE), and paraseptal emphysema (PSE). Automating emphysema classification can help determine the patterns of lung destruction precisely and provide a quantitative evaluation.

Approach. We propose a vision transformer (ViT) model to classify emphysema subtypes from CT images. First, large patches (61×61) containing normal lung parenchyma (NLP), CLE, PLE, or PSE are cropped from the CT images. After resizing, each large patch is divided into small patches, which are converted into a sequence of patch embeddings by flattening and linear embedding. A learnable class embedding is concatenated to the patch embeddings, and a positional embedding is added to the resulting sequence. The sequence is then fed into the transformer encoder blocks to generate the final representation. Finally, the class embedding output by the encoder is fed to a softmax layer to classify the emphysema subtype.

Main results. To compensate for the lack of large-scale training data, transformer encoder blocks pre-trained on ImageNet are transferred and fine-tuned in our ViT model. The pre-trained ViT model achieves an average accuracy of 95.95% on our lab's own dataset, higher than that of AlexNet, Inception-V3, MobileNet-V2, ResNet34, and ResNet50. The pre-trained ViT model also outperforms the ViT model without pre-training. On the public dataset, the accuracy of our pre-trained ViT model is higher than or comparable to that of available methods.

Significance. The results demonstrate that the proposed ViT model can accurately classify emphysema subtypes from CT images. The model can support effective computer-aided diagnosis of emphysema, and the ViT method can be extended to other medical applications.
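
The embedding step described in the Approach (split into small patches, flatten, linear embedding, prepend a class embedding, add a positional embedding) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives neither the resized image size nor the small-patch size, so the toy values SIZE, PATCH, and DIM are assumptions, all function names are hypothetical, and the random weights stand in for parameters that would be learned. Plain Python lists are used only to keep the sketch dependency-free; a tensor library would be used in practice.

```python
import random

def patchify(image, patch):
    """Split a square 2-D image (list of rows) into flattened patches, row-major."""
    size = len(image)
    assert size % patch == 0, "image side must be divisible by patch side"
    patches = []
    for top in range(0, size, patch):
        for left in range(0, size, patch):
            patches.append([image[top + r][left + c]
                            for r in range(patch) for c in range(patch)])
    return patches

def linear(vec, weight, bias):
    """Project vec (length d_in) with weight (d_in x d_out) and add bias (d_out)."""
    return [sum(v * w_row[j] for v, w_row in zip(vec, weight)) + bias[j]
            for j in range(len(bias))]

def embed(image, patch, weight, bias, cls_token, pos_embed):
    """Build the encoder input: [class embedding; patch embeddings] + positions."""
    tokens = [linear(p, weight, bias) for p in patchify(image, patch)]
    tokens = [list(cls_token)] + tokens          # prepend the class embedding
    assert len(tokens) == len(pos_embed)
    return [[t + p for t, p in zip(tok, pos)]    # add the positional embedding
            for tok, pos in zip(tokens, pos_embed)]

# Toy sizes (assumptions; the abstract does not state the real ones).
SIZE, PATCH, DIM = 8, 4, 6
rng = random.Random(0)
image = [[rng.random() for _ in range(SIZE)] for _ in range(SIZE)]
n_patches = (SIZE // PATCH) ** 2

# Placeholders for parameters that would be learned during training.
weight = [[rng.gauss(0, 0.02) for _ in range(DIM)] for _ in range(PATCH * PATCH)]
bias = [0.0] * DIM
cls_token = [0.0] * DIM
pos_embed = [[rng.gauss(0, 0.02) for _ in range(DIM)] for _ in range(n_patches + 1)]

sequence = embed(image, PATCH, weight, bias, cls_token, pos_embed)
print(len(sequence), len(sequence[0]))  # 5 6
```

The resulting sequence has one more token than there are small patches; after the encoder blocks, only the first (class) token would be passed to the softmax classifier over the four classes (NLP, CLE, PLE, PSE).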