Pest-ConFormer: A hybrid CNN-Transformer architecture for large-scale multi-class crop pest recognition

Cited by: 2
Authors
Fang, Mingwei [1]
Tan, Zhiping [1]
Tang, Yu [1]
Chen, Weizhao [1]
Huang, Huasheng [1]
Dananjayan, Sathian [2]
He, Yong [3]
Luo, Shaoming [4]
Affiliations
[1] Guangdong Polytech Normal Univ, Interdisciplinary Studies, Guangzhou, Peoples R China
[2] Vellore Inst Technol, Sch Comp Sci & Engn, Chennai, Tamilnadu, India
[3] Zhejiang Univ, Coll Biosyst Engn & Food Sci, Hangzhou, Peoples R China
[4] Foshan Univ, Sch Mechatron Engn & Automat, Foshan, Peoples R China
Keywords
Crop pest classification; Transformer; Graph Convolutional Network; Fine-grained visual classification; NEURAL-NETWORK; IDENTIFICATION;
DOI
10.1016/j.eswa.2024.124833
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Crop pests are a major cause of reduced yield and quality in agricultural production worldwide. Recognizing crop pests accurately at an early stage is urgently needed to protect crops and reduce losses to the agricultural economy. Because of the ecological characteristics of crop pests and the complex backgrounds found in fields, pests show high inter-class similarity and significant intra-class variation in external morphology, so current recognition methods suffer from low classification accuracy and poor generalization in complex natural environments. To tackle this problem, a hybrid convolutional neural network (CNN) and transformer model named Pest-ConFormer, featuring multi-scale weakly supervised feature selection mechanisms, is proposed; it shows excellent multi-scale discriminative feature extraction in fine-grained visual classification (FGVC) tasks. First, the method employs a hybrid convolution-transformer encoder, pre-trained in a self-supervised masked autoencoder manner, as the backbone to learn highly discriminative pest features across various scales. Second, a dual-path feature aggregation structure, combining a top-down FPN-like pathway and a bottom-up PANet-like pathway based on attention mechanisms, is designed to learn high-level global context information and low-level local detail. Third, a fine-grained classification module using weakly supervised learning selects discriminative feature points at different pyramid levels, and these feature points are then fed into a graph convolutional network to perform classification. Experiments on the large-scale multi-class IP102 benchmark dataset show that the proposed method achieves a crop pest recognition accuracy of 77.81%.
The experimental results indicate that our approach outperforms other state-of-the-art methods by nearly 2 percentage points, demonstrating that the proposed hybrid architecture with dual-path feature aggregation and fine-grained classification modules can be more effective for crop pest recognition than CNN-based methods and can be deployed in practical natural environments. The source code will be available at https://github.com/mwfang/pestconformer.
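
The last two stages described in the abstract (selecting discriminative feature points, then classifying them with a graph convolutional network) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The selection criterion (highest per-point softmax confidence), the fully connected graph over the selected points, and every name below (`select_top_k`, `gcn_layer`, the toy logits and features) are assumptions made for illustration only; the abstract does not specify these details.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_top_k(point_logits, k):
    """Return indices of the k feature points with the most confident
    per-point class prediction (assumed selection criterion)."""
    scores = [max(softmax(logits)) for logits in point_logits]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def normalized_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with self-loops."""
    n = len(adj)
    a = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(A_hat @ H @ W)."""
    a_hat = normalized_adjacency(adj)
    out = matmul(matmul(a_hat, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


# Toy example: 4 feature points, 3 classes (hypothetical values).
point_logits = [
    [0.1, 0.2, 0.1],
    [4.0, 0.0, 0.0],
    [0.0, 3.0, 0.5],
    [0.3, 0.3, 0.3],
]
selected = select_top_k(point_logits, k=2)  # selected == [1, 2]

# Assumed graph: the selected points are fully connected.
adj = [[0.0, 1.0], [1.0, 0.0]]
feats = [[1.0, 0.0], [0.0, 1.0]]   # toy 2-d features of the selected points
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity weights for readability
out = gcn_layer(adj, feats, weight)
```

In this toy setup each selected point ends up with the average of its own and its neighbor's features, which is the message-passing behavior a graph convolution provides; a real model would learn `weight` and pool the node outputs into class scores.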
Pages: 15