Comparison of Vision Transformers and Convolutional Neural Networks in Medical Image Analysis: A Systematic Review

Cited by: 7
Authors
Takahashi, Satoshi [1, 2]
Sakaguchi, Yusuke [1, 3]
Kouno, Nobuji [1, 2, 4]
Takasawa, Ken [1, 2]
Ishizu, Kenichi [1]
Akagi, Yu [5]
Aoyama, Rina [1, 6]
Teraya, Naoki [1, 6]
Bolatkan, Amina [1, 2]
Shinkai, Norio [1, 2]
Machino, Hidenori [1, 2]
Kobayashi, Kazuma [1, 2]
Asada, Ken [1, 2]
Komatsu, Masaaki [1, 2]
Kaneko, Syuzo [1]
Sugiyama, Masashi [7]
Hamamoto, Ryuji [1, 2]
Affiliations
[1] Natl Canc Ctr, Res Inst, Div Med AI Res & Dev, 5-1-1 Tsukiji,Chuo Ku, Tokyo 1040045, Japan
[2] RIKEN, Ctr Adv Intelligence Project, Canc Translat Res Team, 1-4-1 Nihonbashi,Chuo Ku, Tokyo 1030027, Japan
[3] Univ Tokyo, Grad Sch Med, Dept Neurosurg, 7-3-1 Hongo Bunkyo ku, Tokyo 1138655, Japan
[4] Kyoto Univ, Grad Sch Med, Dept Surg, Yoshida konoe cho,Sakyo ku, Kyoto 6068303, Japan
[5] Univ Tokyo, Grad Sch Med, Dept Biomed Informat, 7-3-1 Hongo,Bunkyo Ku, Tokyo 1138655, Japan
[6] Showa Univ, Sch Med, Dept Obstet & Gynecol, 1-5-8 Hatanodai Shinagawa ku, Tokyo 1428666, Japan
[7] RIKEN, Ctr Adv Intelligence Project, Tokyo 1030027, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
Artificial intelligence; Vision transformer; Convolutional neural network; Medical image analysis; Prior learning; SEGMENTATION;
DOI
10.1007/s10916-024-02105-8
Chinese Library Classification
R19 [Health care organization and services (health services administration)];
Abstract
In the rapidly evolving field of medical image analysis using artificial intelligence (AI), selecting an appropriate computational model is critical for accurate diagnosis and patient care. This literature review provides a comprehensive comparison of vision transformers (ViTs) and convolutional neural networks (CNNs), the two leading deep learning techniques in medical imaging. The survey was conducted systematically, with particular attention to the robustness, computational efficiency, scalability, and accuracy of these models in handling complex medical datasets. The review incorporates findings from 36 studies and indicates a collective trend: transformer-based models, particularly ViTs, show significant potential across diverse medical imaging tasks and perform better than conventional CNN models. The findings also show that pre-training is important for transformer applications. We expect this work to help researchers and practitioners select the most appropriate model for specific medical image analysis tasks, accounting for the current state of the art and future trends in the field.
Pages: 22