Exploring the synergies of hybrid convolutional neural network and Vision Transformer architectures for computer vision: A survey

Cited by: 0
Authors
Haruna, Yunusa [1]
Qin, Shiyin [1]
Chukkol, Abdulrahman Hamman Adama [2]
Yusuf, Abdulganiyu Abdu [3]
Bello, Isah [4]
Lawan, Adamu [5]
Affiliations
[1] Beihang Univ, Sch Automat Sci & Elect Engn, Beijing, Peoples R China
[2] Beijing Inst Technol, Sch Informat & Elect, Beijing, Peoples R China
[3] Beijing Inst Technol, Sch Comp Sci, Beijing, Peoples R China
[4] Tianjin Univ, Sch Elect & Informat Engn, Tianjin, Peoples R China
[5] Beihang Univ, Sch Comp Sci & Technol, Beijing, Peoples R China
Keywords
Attention mechanism; Convolutional neural network; Hybrid models; Image classification; Object detection; Vision transformer
DOI
10.1016/j.engappai.2025.110057
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Hybrid architectures that combine Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) have emerged as a prominent approach in Computer Vision (CV), advancing tasks such as image classification, object detection, and segmentation. This review examines the literature on state-of-the-art hybrid CNN-ViT architectures and explores the synergies between the two approaches. The survey covers: (1) background on the vanilla CNN and ViT; (2) a systematic review of hybrid designs, organized by taxonomy, to explore the synergy achieved by merging CNN and ViT models; (3) a comparative analysis of hybrid architectures, including task-specific synergy and real-world applications; (4) challenges and future directions for hybrid models; and (5) a summary of key findings and recommendations. The survey aims to serve as a guiding resource that improves understanding of the interplay between CNNs and ViTs and their impact on future developments in CV.
Pages: 26
Related papers
50 in total
  • [1] Multispectral Plant Disease Detection with Vision Transformer-Convolutional Neural Network Hybrid Approaches
    De Silva, Malithi
    Brown, Dane
    SENSORS, 2023, 23 (20)
  • [2] CoVi-Net: A hybrid convolutional and vision transformer neural network for retinal vessel segmentation
    Jiang, Minshan
    Zhu, Yongfei
    Zhang, Xuedian
    COMPUTERS IN BIOLOGY AND MEDICINE, 2024, 170
  • [3] Hyneter: Hybrid Network Transformer for Multiple Computer Vision Tasks
    Chen, Dong
    Miao, Duoqian
    Zhao, Xuerong
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (06) : 8773 - 8785
  • [4] Survey of Transformer Research in Computer Vision
    Li, Xiang
    Zhang, Tao
    Zhang, Zhe
    Wei, Hongyang
    Qian, Yurong
    Computer Engineering and Applications, 2023, 59 (01) : 1 - 14
  • [5] Bibliometric Analysis of the Application of Convolutional Neural Network in Computer Vision
    Chen, Huie
    Deng, Zhenjie
    IEEE ACCESS, 2020, 8 : 155417 - 155428
  • [6] Vision transformer meets convolutional neural network for plant disease classification
    Thakur, Poornima Singh
    Chaturvedi, Shubhangi
    Khanna, Pritee
    Sheorey, Tanuja
    Ojha, Aparajita
    ECOLOGICAL INFORMATICS, 2023, 77
  • [7] Survey of Vision Transformer in Low-Level Computer Vision
    Zhu, Kai
    Li, Li
    Zhang, Tong
    Jiang, Sheng
    Bie, Yiming
    Computer Engineering and Applications, 2024, 60 (04) : 39 - 56
  • [8] Effective Processing of Convolutional Neural Networks for Computer Vision: A Tutorial and Survey
    Tombe, Ronald
    Viriri, Serestina
    IETE TECHNICAL REVIEW, 2022, 39 (01) : 49 - 62
  • [9] Efficient knowledge distillation for hybrid models: A vision transformer-convolutional neural network to convolutional neural network approach for classifying remote sensing images
    Song, Huaxiang
    Yuan, Yuxuan
    Ouyang, Zhiwei
    Yang, Yu
    Xiang, Hui
    IET CYBER-SYSTEMS AND ROBOTICS, 2024, 6 (03)
  • [10] Computer Vision and Convolutional Neural Network for Dense Crowd Count Detection
    Sirisha, D.
    Prasad, S. Sambhu
    Kumar, Subodh
    ARTIFICIAL INTELLIGENCE: THEORY AND APPLICATIONS, VOL 2, AITA 2023, 2024, 844 : 353 - 362