Enhancing Computer Vision Performance: A Hybrid Deep Learning Approach with CNNs and Vision Transformers

Cited by: 1
Authors
Sardar, Abha Singh [1]
Ranjan, Vivek [1]
Affiliations
[1] Maulana Azad Natl Inst Technol, Dept Comp Sci & Engn, Bhopal, Madhya Pradesh, India
Source
COMPUTER VISION AND IMAGE PROCESSING, CVIP 2023, PT II | 2024 / Vol. 2010
Keywords
Convolutional Neural Networks (CNNs); Vision Transformers (ViTs); Image classification; Plant Disease; Limited data;
DOI
10.1007/978-3-031-58174-8_49
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article explores the growing prominence of deep learning algorithms in computer vision tasks, focusing on the strengths and weaknesses of Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). CNNs have dominated computer vision tasks since their inception due to their ability to identify features irrespective of their location, scale, or orientation. However, their efficiency is limited, particularly in managing long-range dependencies. Conversely, ViTs, while high performing, are "data-hungry" and require substantial training data to reach their full potential, posing a significant obstacle in areas with limited data availability such as healthcare and plant pathology. To address these limitations, we propose a hybrid approach that integrates the strengths of both CNNs and ViTs, aiming to create a robust model that remains efficient across a range of data sizes. Testing on the Plant Disease and Tomato Leaf Disease Classification datasets demonstrates the efficacy of our model, with marked improvements in F1 score and accuracy and a significant reduction in loss compared to the base CNN. These findings demonstrate the potential of the proposed method for identifying plant diseases, making a significant contribution to advancements in agricultural technology. This research initiates a crucial discussion on balancing performance and practical data constraints in the fast-evolving field of deep learning.
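The abstract reports accuracy and F1 score as its evaluation metrics. As a reading aid, the following is a minimal sketch of how these two metrics are commonly computed for a multi-class classifier. It is not the authors' evaluation code: the record does not say which F1 averaging was used, so macro averaging is an assumption here, and the function names and class labels (`healthy`, `blight`, `mosaic`) are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Macro averaging is an assumption; the record does not state
    which averaging scheme the paper uses.
    """
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # predicted p, but it was not p
            fn[t] += 1      # missed a true instance of t
    scores = []
    for c in classes:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels for illustration only, not data from the paper.
y_true = ["healthy", "blight", "blight", "mosaic", "healthy", "mosaic"]
y_pred = ["healthy", "blight", "mosaic", "mosaic", "healthy", "blight"]

print("accuracy:", accuracy(y_true, y_pred))
print("macro F1:", macro_f1(y_true, y_pred))
```

Macro F1 weights every class equally, which matters on imbalanced disease datasets where accuracy alone can hide poor performance on rare classes.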
Pages: 591-602
Page count: 12
Related papers
50 in total
  • [21] Special focus on deep learning for computer vision
    Xiang Bai
    Yanwei Pang
    Guofeng Zhang
    Science China Information Sciences, 2020, 63
  • [22] Applications and Challenges of Deep Learning in Computer Vision
    Singh, Chetanpal
    HEALTH INFORMATION SCIENCE, HIS 2021, 2021, 13079 : 223 - 233
  • [23] Special focus on deep learning for computer vision
    Xiang BAI
    Yanwei PANG
    Guofeng ZHANG
    Science China(Information Sciences), 2020, 63 (02) : 5 - 6
  • [24] Special focus on deep learning for computer vision
    Yanwei Pang
    Xiang Bai
    Guofeng Zhang
    Science China Information Sciences, 2019, 62
  • [25] Guest Editorial: Deep Learning in Computer Vision
    Hospedales, Timothy
    Romero, Adriana
    Vazquez, David
    IET COMPUTER VISION, 2017, 11 (08) : 621 - 622
  • [26] Deep Learning for Computer Vision: A Brief Review
    Voulodimos, Athanasios
    Doulamis, Nikolaos
    Doulamis, Anastasios
    Protopapadakis, Eftychios
    COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2018, 2018
  • [27] Special focus on deep learning for computer vision
    Bai, Xiang
    Pang, Yanwei
    Zhang, Guofeng
    SCIENCE CHINA-INFORMATION SCIENCES, 2020, 63 (02)
  • [28] Leveraging Deep Learning for Computer Vision: A Review
    Alam, Ekram
    Abu Sufian
    Das, Akhil Kumar
    Bhattacharya, Arijit
    Ali, Md Firoj
    Rahman, M. M. Hafizur
    2021 22ND INTERNATIONAL ARAB CONFERENCE ON INFORMATION TECHNOLOGY (ACIT), 2021, : 298 - 305
  • [29] Editorial-Deep Learning for Computer Vision
    Girshick, Ross
    Kokkinos, Iasonas
    Laptev, Ivan
    Malik, Jitendra
    Papandreou, George
    Vedaldi, Andrea
    Wang, Xiaogang
    Yan, Shuicheng
    Yuille, Alan
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2017, 164 : 1 - 2
  • [30] Computer Vision and Deep Learning for Precision Viticulture
    Mohimont, Lucas
    Alin, Francois
    Rondeau, Marine
    Gaveau, Nathalie
    Steffenel, Luiz Angelo
    AGRONOMY-BASEL, 2022, 12 (10):