Classification of Mobile-Based Oral Cancer Images Using the Vision Transformer and the Swin Transformer

Cited by: 8
Authors
Song, Bofan [1 ]
Raj, Dharma K. C. [2 ]
Yang, Rubin Yuchan [2 ]
Li, Shaobai [1 ]
Zhang, Chicheng [2 ]
Liang, Rongguang [1 ]
Affiliations
[1] Univ Arizona, Wyant Coll Opt Sci, Tucson, AZ 85721 USA
[2] Univ Arizona, Comp Sci Dept, Tucson, AZ 85721 USA
Keywords
Vision Transformer; Swin Transformer; oral cancer; oral image analysis; artificial intelligence;
DOI
10.3390/cancers16050987
Chinese Library Classification: R73 [Oncology]
Discipline code: 100214
Abstract
Simple Summary: Transformer models, originally successful in natural language processing, have found application in computer vision, demonstrating promising results in cancer image analysis tasks. Although oral cancer is one of the most prevalent and fastest-spreading cancers globally, accurate automated analysis methods for it are still lacking, a need that is particularly critical for high-risk populations in low- and middle-income countries. In this study, we evaluated the performance of the Vision Transformer (ViT) and the Swin Transformer in the classification of mobile-based oral cancer images we collected from high-risk populations. The results showed that the Swin Transformer model achieved higher accuracy than the ViT model, and both transformer models performed better than the conventional convolutional model VGG19.

Abstract: Oral cancer, a pervasive and rapidly growing malignant disease, poses a significant global health concern. Early and accurate diagnosis is pivotal for improving patient outcomes. Automatic diagnosis methods based on artificial intelligence have shown promising results in the oral cancer field, but their accuracy still needs to be improved for realistic diagnostic scenarios. Vision Transformers (ViTs) have recently outperformed deep CNN models on many computer vision benchmark tasks. This study explores the effectiveness of the Vision Transformer and the Swin Transformer, two cutting-edge variants of the transformer architecture, for mobile-based oral cancer image classification. The pre-trained Swin Transformer model achieved 88.7% accuracy in the binary classification task, outperforming the ViT model by 2.3%, while the conventional convolutional network models VGG19 and ResNet50 achieved 85.2% and 84.5% accuracy, respectively.
Our experiments demonstrate that these transformer-based architectures outperform traditional convolutional neural networks for oral cancer image classification and underscore the potential of the ViT and the Swin Transformer to advance the state of the art in oral cancer image analysis.
Pages: 10