Enhancing Computer Vision Performance: A Hybrid Deep Learning Approach with CNNs and Vision Transformers

Cited by: 1

Authors:

Sardar, Abha Singh [1]

Ranjan, Vivek [1]

Affiliation:

[1] Maulana Azad Natl Inst Technol, Dept Comp Sci & Engn, Bhopal, Madhya Pradesh, India

Source:

COMPUTER VISION AND IMAGE PROCESSING, CVIP 2023, PT II | 2024 / Vol. 2010

Keywords:

Convolutional Neural Networks (CNNs); Vision Transformers (ViTs); Image classification; Plant Disease; Limited data;

DOI:

10.1007/978-3-031-58174-8_49

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This article explores the growing prominence of deep learning algorithms in computer vision tasks, focusing on the strengths and weaknesses of Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). CNNs have dominated computer vision tasks since their inception due to their ability to identify features irrespective of their location, scale, or orientation. However, their efficiency is limited, particularly in managing long-range dependencies. Conversely, ViTs, while high-performing, are "data-hungry" and require substantial training data to reach their full potential, which poses a significant obstacle in areas with limited data availability such as healthcare and plant pathology. To address these limitations, we propose a hybrid approach that integrates the strengths of both CNNs and ViTs, aiming to create a robust model that is efficient across a range of data sizes. Testing on the Plant Disease and Tomato Leaf Disease Classification datasets demonstrates the efficacy of our model, with marked improvements in F1 score and accuracy and a significant reduction in loss compared to the base CNN. These findings demonstrate the potential of the proposed method for identifying plant diseases, contributing to advancements in agricultural technology. This research initiates a crucial discussion on balancing performance and practical data constraints in the fast-evolving field of deep learning.

Pages: 591-602

Page count: 12

