Classification of Mobile-Based Oral Cancer Images Using the Vision Transformer and the Swin Transformer

Cited by: 8
Authors
Song, Bofan [1 ]
Raj, Dharma K. C. [2 ]
Yang, Rubin Yuchan [2 ]
Li, Shaobai [1 ]
Zhang, Chicheng [2 ]
Liang, Rongguang [1 ]
Affiliations
[1] Univ Arizona, Wyant Coll Opt Sci, Tucson, AZ 85721 USA
[2] Univ Arizona, Comp Sci Dept, Tucson, AZ 85721 USA
Keywords
Vision Transformer; Swin Transformer; oral cancer; oral image analysis; artificial intelligence;
DOI
10.3390/cancers16050987
CLC Classification: R73 [Oncology]
Subject Classification Code: 100214
Abstract
Simple Summary: Transformer models, originally successful in natural language processing, have found application in computer vision, demonstrating promising results in cancer image analysis tasks. Oral cancer is one of the most prevalent and most rapidly spreading cancers globally, yet accurate automated analysis methods for it are still lacking. This need is particularly critical for high-risk populations residing in low- and middle-income countries. In this study, we evaluated the performance of the Vision Transformer (ViT) and the Swin Transformer in classifying mobile-based oral cancer images collected from high-risk populations. The results showed that the Swin Transformer model achieved higher accuracy than the ViT model, and both transformer models outperformed the conventional convolutional model VGG19.

Abstract: Oral cancer, a pervasive and rapidly growing malignant disease, poses a significant global health concern. Early and accurate diagnosis is pivotal for improving patient outcomes. Automatic diagnosis methods based on artificial intelligence have shown promising results in the oral cancer field, but their accuracy still needs to improve for realistic diagnostic scenarios. Vision Transformers (ViTs) have recently outperformed deep learning CNN models in many computer vision benchmark tasks. This study explores the effectiveness of the Vision Transformer and the Swin Transformer, two cutting-edge variants of the transformer architecture, for mobile-based oral cancer image classification. The pre-trained Swin Transformer model achieved 88.7% accuracy in the binary classification task, outperforming the ViT model by 2.3%, while the conventional convolutional network models VGG19 and ResNet50 achieved 85.2% and 84.5% accuracy, respectively.
Our experiments demonstrate that these transformer-based architectures outperform traditional convolutional neural networks in terms of oral cancer image classification, and underscore the potential of the ViT and the Swin Transformer in advancing the state of the art in oral cancer image analysis.
Pages: 10