Comparison of Vision Transformers and Convolutional Neural Networks in Medical Image Analysis: A Systematic Review

Cited by: 7
Authors
Takahashi, Satoshi [1, 2]
Sakaguchi, Yusuke [1, 3]
Kouno, Nobuji [1, 2, 4]
Takasawa, Ken [1, 2]
Ishizu, Kenichi [1]
Akagi, Yu [5]
Aoyama, Rina [1, 6]
Teraya, Naoki [1, 6]
Bolatkan, Amina [1, 2]
Shinkai, Norio [1, 2]
Machino, Hidenori [1, 2]
Kobayashi, Kazuma [1, 2]
Asada, Ken [1, 2]
Komatsu, Masaaki [1, 2]
Kaneko, Syuzo [1]
Sugiyama, Masashi [7]
Hamamoto, Ryuji [1, 2]
Affiliations
[1] Natl Canc Ctr, Res Inst, Div Med AI Res & Dev, 5-1-1 Tsukiji,Chuo Ku, Tokyo 1040045, Japan
[2] RIKEN, Ctr Adv Intelligence Project, Canc Translat Res Team, 1-4-1 Nihonbashi,Chuo Ku, Tokyo 1030027, Japan
[3] Univ Tokyo, Grad Sch Med, Dept Neurosurg, 7-3-1 Hongo Bunkyo ku, Tokyo 1138655, Japan
[4] Kyoto Univ, Grad Sch Med, Dept Surg, Yoshida konoe cho,Sakyo ku, Kyoto 6068303, Japan
[5] Univ Tokyo, Grad Sch Med, Dept Biomed Informat, 7-3-1 Hongo,Bunkyo Ku, Tokyo 1138655, Japan
[6] Showa Univ, Sch Med, Dept Obstet & Gynecol, 1-5-8 Hatanodai Shinagawa ku, Tokyo 1428666, Japan
[7] RIKEN, Ctr Adv Intelligence Project, Tokyo 1030027, Japan
Funding
Japan Society for the Promotion of Science
Keywords
Artificial intelligence; Vision transformer; Convolutional neural network; Medical image analysis; Prior learning; SEGMENTATION;
DOI
10.1007/s10916-024-02105-8
Chinese Library Classification number
R19 [Health organizations and services (health services administration)];
Abstract
In the rapidly evolving field of medical image analysis utilizing artificial intelligence (AI), the selection of appropriate computational models is critical for accurate diagnosis and patient care. This literature review provides a comprehensive comparison of vision transformers (ViTs) and convolutional neural networks (CNNs), the two leading techniques in the field of deep learning in medical imaging. We conducted a systematic survey. Particular attention was given to the robustness, computational efficiency, scalability, and accuracy of these models in handling complex medical datasets. The review incorporates findings from 36 studies and indicates a collective trend that transformer-based models, particularly ViTs, exhibit significant potential in diverse medical imaging tasks, showcasing superior performance when contrasted with conventional CNN models. Additionally, it is evident that pre-training is important for transformer applications. We expect this work to help researchers and practitioners select the most appropriate model for specific medical image analysis tasks, accounting for the current state of the art and future trends in the field.
Pages: 22
Related papers
50 in total
  • [21] Dynamic Spatial Sparsification for Efficient Vision Transformers and Convolutional Neural Networks
    Rao, Yongming
    Liu, Zuyan
    Zhao, Wenliang
    Zhou, Jie
    Lu, Jiwen
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (09) : 10883 - 10897
  • [22] Understanding and improving adversarial transferability of vision transformers and convolutional neural networks
    Chen, Zhiyu
    Xu, Chi
    Lv, Huanhuan
    Liu, Shangdong
    Ji, Yimu
    INFORMATION SCIENCES, 2023, 648
  • [23] A review of convolutional neural networks in computer vision
    Xia Zhao
    Limin Wang
    Yufei Zhang
    Xuming Han
    Muhammet Deveci
    Milan Parmar
    Artificial Intelligence Review, 57
  • [24] A review of convolutional neural networks in computer vision
    Zhao, Xia
    Wang, Limin
    Zhang, Yufei
    Han, Xuming
    Deveci, Muhammet
    Parmar, Milan
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 57 (04)
  • [25] Comparative Analysis of Vision Transformers and Conventional Convolutional Neural Networks in Detecting Referable Diabetic Retinopathy
    Goh, Jocelyn Hui Lin
    Ang, Elroy
    Srinivasan, Sahana
    Lei, Xiaofeng
    Loh, Johnathan
    Quek, Ten Cheer
    Xue, Cancan
    Xu, Xinxing
    Liu, Yong
    Cheng, Ching-Yu
    Rajapakse, Jagath C.
    Tham, Yih-Chung
    OPHTHALMOLOGY SCIENCE, 2024, 4 (06):
  • [26] A systematic review of vision transformers and convolutional neural networks for Alzheimer’s disease classification using 3D MRI images
    Bravo-Ortiz, Mario Alejandro
    Holguin-Garcia, Sergio Alejandro
    Quiñones-Arredondo, Sebastián
    Mora-Rubio, Alejandro
    Guevara-Navarro, Ernesto
    Arteaga-Arteaga, Harold Brayan
    Ruz, Gonzalo A.
    Tabares-Soto, Reinel
    Neural Computing and Applications, 2024, 36 (35) : 21985 - 22012
  • [27] MoViT: Memorizing Vision Transformers for Medical Image Analysis
    Shen, Yiqing
    Guo, Pengfei
    Wu, Jingpu
    Huang, Qianqi
    Le, Nhat
    Zhou, Jinyuan
    Jiang, Shanshan
    Unberath, Mathias
    MACHINE LEARNING IN MEDICAL IMAGING, MLMI 2023, PT II, 2024, 14349 : 205 - 213
  • [28] On the Use of Convolutional Neural Networks with Patterned Stride for Medical Image Analysis
    Zaniolo, Luiz
    Marques, Oge
    Machine Graphics and Vision, 2021, 30 (01) : 3 - 22
  • [29] Convformer: Dual-Stream Vision Transformers and Convolutional Networks for Image Restoration
    Yang, Changzhi
    Pan, Huihui
    Wang, Jue
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73
  • [30] Bridging the Gap Between Vision Transformers and Convolutional Neural Networks on Small Datasets
    Lu, Zhiying
    Xie, Hongtao
    Liu, Chuanbin
    Zhang, Yongdong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022,