Lightweight Vision Transformer for damaged wheat detection and classification using spectrograms

Cited by: 0
Authors
Lin, Hao [1]
Guo, Min [1]
Ma, Miao [1]
Affiliations
[1] Shaanxi Normal Univ, Sch Comp Sci, Key Lab Modern Teaching Technol, Minist Educ, Xian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
neural architecture search; auto machine learning; wheat kernels; classification; spectrogram;
DOI
10.1117/1.JEI.33.5.053063
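Chinese Library Classification number
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification code
0808; 0809;
Abstract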
Grain is a basic human necessity, and its quality and safety directly affect dietary health. Several problems arise during grain storage, chiefly mold and pest infestation. With the development of artificial intelligence, a growing number of techniques are being applied to grain detection and classification, and transformer-based models are becoming popular for this task. Although transformer models perform well, they are often large and cumbersome, which limits their practical use. We propose a framework named KD-ASF, based on intermediate-layer knowledge distillation and one-shot neural architecture search, to optimize the hyperparameters of a vision transformer (ViT) for detecting and classifying molded wheat kernels (MDK), insect-damaged wheat kernels (IDK), and undamaged wheat kernels (UDK). In KD-ASF, a ViT model serves as the teacher network. We then design a search space containing the adjustable hyperparameters of the transformer building blocks. The super-network stacks the maximum number of transformer building blocks and is trained under the guidance of the teacher network. The trained super-network then undergoes evolutionary search, and the resulting networks are used to classify the different wheat kernels. In experiments with five-fold cross-validation, we obtained an F1 score of 97.13%, and the parameter size of the final model is only 5.94M. The results show that the method outperforms the majority of neural networks and is significantly smaller than most network models. Its light weight makes it easy to deploy and apply. These findings indicate that the structure of KD-ASF is feasible and effective. (c) 2024 SPIE and IS&T
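The evolutionary search step mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the search-space values, the rough parameter-count formula, the helper names, and the toy fitness function. In a one-shot setting the fitness would normally be the validation score of a sub-network that inherits its weights from the trained super-network.

```python
import itertools
import random

# Hypothetical search space; the abstract does not list the actual choices.
SEARCH_SPACE = {
    "depth": [4, 6, 8],
    "embed_dim": [128, 192, 256],
    "num_heads": [2, 4],
    "mlp_ratio": [2, 4],
}

def count_params(cfg):
    """Rough weight count of the transformer blocks (biases, norms, embeddings ignored)."""
    d = cfg["embed_dim"]
    per_block = 4 * d * d + 2 * cfg["mlp_ratio"] * d * d  # attention + MLP weights
    return cfg["depth"] * per_block

def sample(rng):
    """Draw one random configuration from the search space."""
    return {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}

def mutate(cfg, rng, prob=0.3):
    """Resample each hyperparameter independently with probability `prob`."""
    return {k: (rng.choice(SEARCH_SPACE[k]) if rng.random() < prob else v)
            for k, v in cfg.items()}

def crossover(a, b, rng):
    """Take each hyperparameter from one of the two parents at random."""
    return {k: rng.choice([a[k], b[k]]) for k in a}

def evolutionary_search(fitness, max_params, generations=10, pop_size=20,
                        top_k=5, seed=0):
    """Return the fittest configuration found under a parameter budget."""
    keys = list(SEARCH_SPACE)
    all_cfgs = (dict(zip(keys, vals))
                for vals in itertools.product(*SEARCH_SPACE.values()))
    if min(count_params(c) for c in all_cfgs) > max_params:
        raise ValueError("no configuration fits the parameter budget")

    rng = random.Random(seed)
    population = []
    while len(population) < pop_size:
        cfg = sample(rng)
        if count_params(cfg) <= max_params:
            population.append(cfg)

    for _ in range(generations):
        # Keep the best candidates, then refill with mutated or crossed-over children.
        parents = sorted(population, key=fitness, reverse=True)[:top_k]
        children = list(parents)
        while len(children) < pop_size:
            if rng.random() < 0.5:
                child = mutate(rng.choice(parents), rng)
            else:
                child = crossover(rng.choice(parents), rng.choice(parents), rng)
            if count_params(child) <= max_params:
                children.append(child)
        population = children
    return max(population, key=fitness)

def toy_fitness(cfg):
    """Stand-in score; a real run would evaluate the sub-network on validation data."""
    return cfg["depth"] * cfg["embed_dim"]

best = evolutionary_search(toy_fitness, max_params=3_000_000)
print(best, count_params(best))
```

The parameter budget plays the role of the model-size constraint: candidates that exceed it are discarded before they enter the population, so the returned configuration always respects the limit.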
Pages: 16