Pest-ConFormer: A hybrid CNN-Transformer architecture for large-scale multi-class crop pest recognition

Cited by: 2
Authors
Fang, Mingwei [1]
Tan, Zhiping [1]
Tang, Yu [1]
Chen, Weizhao [1]
Huang, Huasheng [1]
Dananjayan, Sathian [2]
He, Yong [3]
Luo, Shaoming [4]
Affiliations
[1] Guangdong Polytech Normal Univ, Interdisciplinary Studies, Guangzhou, Peoples R China
[2] Vellore Inst Technol, Sch Comp Sci & Engn, Chennai, Tamilnadu, India
[3] Zhejiang Univ, Coll Biosyst Engn & Food Sci, Hangzhou, Peoples R China
[4] Foshan Univ, Sch Mechatron Engn & Automat, Foshan, Peoples R China
Keywords
Crop pest classification; Transformer; Graph Convolutional Network; Fine-grained visual classification; NEURAL-NETWORK; IDENTIFICATION
DOI
10.1016/j.eswa.2024.124833
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Crop pests are a major cause of reduced yield and quality in agricultural production worldwide. Recognizing them accurately at an early stage is urgently needed to protect crops and limit economic losses. Because of the ecological characteristics of crop pests and the complex backgrounds found in fields, pests show high inter-class similarity and significant intra-class variation in external morphology, so current recognition methods suffer from low classification accuracy and poor generalization in complex natural environments. To tackle this problem, we propose Pest-ConFormer, a hybrid convolutional neural network and transformer model with multi-scale weakly supervised feature selection mechanisms, which shows excellent multi-scale discriminative feature extraction in fine-grained visual classification (FGVC) tasks. First, a hybrid convolution-transformer encoder, pre-trained in a self-supervised masked autoencoder manner, serves as the backbone to learn highly discriminative pest features across various scales. Second, a dual-path feature aggregation structure based on attention mechanisms, with a top-down FPN-like pathway and a bottom-up PANet-like pathway, is designed to learn high-level global context information and low-level local detail. Third, a fine-grained classification module using weakly supervised learning selects discriminative feature points at different pyramid levels, and these feature points are then fed into a graph convolutional network to perform classification. Experiments on the large-scale multi-class IP102 benchmark dataset show that the proposed method achieves an accuracy of 77.81% for crop pest recognition. The results indicate that our approach outperforms other state-of-the-art methods by nearly 2 percentage points, demonstrating that the proposed hybrid architecture with dual-path feature aggregation and fine-grained classification modules can be more effective for crop pest recognition than CNN-based methods and can be deployed in practical natural environments. The source code will be available at https://github.com/mwfang/pestconformer.
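The last stage described in the abstract (selecting discriminative feature points per pyramid level, then classifying them with a graph convolutional network) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The abstract does not state the selection criterion or the graph structure, so the sketch assumes selection by the highest per-point softmax confidence and a fully connected graph with self-loops. All function names, weights, and toy inputs are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def select_points(features, point_logits, k):
    """Keep the k feature points with the highest top-class confidence.
    Assumption: confidence-based selection; the abstract gives no criterion."""
    scores = [max(softmax(l)) for l in point_logits]
    order = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    return [features[i] for i in order[:k]]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalized_adjacency(n):
    """Fully connected graph with self-loops (an assumption). Every node has
    degree n, so D^-1/2 A D^-1/2 has all entries equal to 1/n."""
    return [[1.0 / n] * n for _ in range(n)]

def gcn_layer(adj, nodes, weight):
    """One graph convolution: ReLU(adj @ nodes @ weight)."""
    hidden = matmul(matmul(adj, nodes), weight)
    return [[max(0.0, v) for v in row] for row in hidden]

def classify(levels, k, weight, head):
    """Select k points per pyramid level, run one GCN layer, mean-pool the
    nodes, and return class probabilities."""
    nodes = []
    for features, point_logits in levels:
        nodes.extend(select_points(features, point_logits, k))
    hidden = gcn_layer(normalized_adjacency(len(nodes)), nodes, weight)
    pooled = [sum(col) / len(hidden) for col in zip(*hidden)]
    return softmax(matmul([pooled], head)[0])

# Toy input: two pyramid levels, each a pair (point features, per-point logits).
levels = [
    ([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]], [[2.0, 0.0], [0.1, 0.0], [0.0, 3.0]]),
    ([[2.0, 2.0], [1.0, 3.0]], [[0.0, 0.0], [1.0, -1.0]]),
]
weight = [[1.0, 0.0], [0.0, 1.0]]          # hypothetical 2x2 GCN weight
head = [[0.5, -0.5, 0.0], [0.0, 0.5, -0.5]]  # hypothetical 2x3 classifier head
probs = classify(levels, k=2, weight=weight, head=head)
```

In a real model the weights would be learned and the per-point logits would come from classifiers trained with image-level labels only, which is what makes the selection weakly supervised.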
Pages: 15