Comparison of Vision Transformers and Convolutional Neural Networks in Medical Image Analysis: A Systematic Review

Authors
Takahashi, Satoshi [1,2]
Sakaguchi, Yusuke [1,3]
Kouno, Nobuji [1,2,4]
Takasawa, Ken [1,2]
Ishizu, Kenichi [1]
Akagi, Yu [5]
Aoyama, Rina [1,6]
Teraya, Naoki [1,6]
Bolatkan, Amina [1,2]
Shinkai, Norio [1,2]
Machino, Hidenori [1,2]
Kobayashi, Kazuma [1,2]
Asada, Ken [1,2]
Komatsu, Masaaki [1,2]
Kaneko, Syuzo [1]
Sugiyama, Masashi [7]
Hamamoto, Ryuji [1,2]
Affiliations
[1] Natl Canc Ctr, Res Inst, Div Med AI Res & Dev, 5-1-1 Tsukiji,Chuo Ku, Tokyo 1040045, Japan
[2] RIKEN, Ctr Adv Intelligence Project, Canc Translat Res Team, 1-4-1 Nihonbashi,Chuo Ku, Tokyo 1030027, Japan
[3] Univ Tokyo, Grad Sch Med, Dept Neurosurg, 7-3-1 Hongo Bunkyo ku, Tokyo 1138655, Japan
[4] Kyoto Univ, Grad Sch Med, Dept Surg, Yoshida konoe cho,Sakyo ku, Kyoto 6068303, Japan
[5] Univ Tokyo, Grad Sch Med, Dept Biomed Informat, 7-3-1 Hongo,Bunkyo Ku, Tokyo 1138655, Japan
[6] Showa Univ, Sch Med, Dept Obstet & Gynecol, 1-5-8 Hatanodai Shinagawa ku, Tokyo 1428666, Japan
[7] RIKEN, Ctr Adv Intelligence Project, Tokyo 1030027, Japan
Funding
Japan Society for the Promotion of Science
Keywords
Artificial intelligence; Vision transformer; Convolutional neural network; Medical image analysis; Prior learning; Segmentation
DOI
10.1007/s10916-024-02105-8
Chinese Library Classification
R19 [Health care organizations and services (health services management)]
Abstract
In the rapidly evolving field of medical image analysis utilizing artificial intelligence (AI), the selection of appropriate computational models is critical for accurate diagnosis and patient care. This literature review provides a comprehensive comparison of vision transformers (ViTs) and convolutional neural networks (CNNs), the two leading techniques in the field of deep learning in medical imaging. We conducted a systematic survey of the literature. Particular attention was given to the robustness, computational efficiency, scalability, and accuracy of these models in handling complex medical datasets. The review incorporates findings from 36 studies and indicates a collective trend that transformer-based models, particularly ViTs, exhibit significant potential in diverse medical imaging tasks, showcasing superior performance when contrasted with conventional CNN models. Additionally, it is evident that pre-training is important for transformer applications. We expect this work to help researchers and practitioners select the most appropriate model for specific medical image analysis tasks, accounting for the current state of the art and future trends in the field.
Pages: 22