Lightweight Vision Transformer for damaged wheat detection and classification using spectrograms

Authors
Lin, Hao [1]
Guo, Min [1]
Ma, Miao [1]
Affiliations
[1] Shaanxi Normal Univ, Sch Comp Sci, Key Lab Modern Teaching Technol, Minist Educ, Xian, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
neural architecture search; auto machine learning; wheat kernels; classification; spectrogram;
DOI
10.1117/1.JEI.33.5.053063
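Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract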
Grain is a basic human necessity, and its quality and safety directly affect dietary health. Various problems arise during grain storage, chiefly mold and pest infestation. With the development of artificial intelligence, more and more techniques are being applied to grain detection and classification, and transformer-based models are becoming popular for this task. Although transformer models perform well, they are often large and cumbersome, which limits practical application. We propose a framework named KD-ASF, based on intermediate-layer knowledge distillation and one-shot neural architecture search, to optimize the hyperparameters of a vision transformer (ViT) for detecting and classifying molded wheat kernels (MDK), insect-damaged wheat kernels (IDK), and undamaged wheat kernels (UDK). In KD-ASF, a ViT model serves as the teacher network. We then design a search space containing the adjustable hyperparameters of the transformer building blocks. The super-network stacks the maximum number of transformer building blocks and is trained under the guidance of the teacher network. The trained super-network then undergoes evolutionary search, and the resulting networks are used to classify the different wheat kernels. In experiments with five-fold cross-validation, we obtained an F1 score of 97.13%, and the final model has only 5.94M parameters. The results show that the method not only outperforms the majority of neural networks but is also significantly smaller than most network models. Its lightweight nature facilitates deployment. These findings indicate that the structure of KD-ASF is feasible and effective. © 2024 SPIE and IS&T
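
The evolutionary search step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give the actual search space, the fitness measure, or the search settings, so `SEARCH_SPACE`, `param_count`, `toy_fitness`, and all numeric values are hypothetical. In a one-shot setting the fitness of a candidate would normally be its validation score using weights inherited from the trained super-network; here a toy stand-in is used so the example runs on its own.

```python
import random

# Hypothetical search space: candidate values for each transformer hyperparameter.
SEARCH_SPACE = {
    "depth": [4, 6, 8, 10, 12],
    "embed_dim": [128, 192, 256],
    "num_heads": [2, 4],
    "mlp_ratio": [2, 3, 4],
}


def sample(rng):
    """Draw one random sub-network configuration from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}


def param_count(cfg):
    """Rough weight count of the encoder blocks (biases and embeddings ignored)."""
    d = cfg["embed_dim"]
    attention = 4 * d * d                # Q, K, V and output projections
    mlp = 2 * d * d * cfg["mlp_ratio"]   # two linear layers of the feed-forward part
    return cfg["depth"] * (attention + mlp)


def mutate(cfg, rng, prob=0.3):
    """Resample each hyperparameter independently with probability `prob`."""
    child = dict(cfg)
    for name, values in SEARCH_SPACE.items():
        if rng.random() < prob:
            child[name] = rng.choice(values)
    return child


def crossover(a, b, rng):
    """Take each hyperparameter from one of the two parents at random."""
    return {name: rng.choice([a[name], b[name]]) for name in SEARCH_SPACE}


def evolutionary_search(fitness, max_params, generations=20,
                        pop_size=20, top_k=5, seed=0):
    """Return the best configuration found under a parameter budget."""
    rng = random.Random(seed)

    def feasible(cfg):
        return param_count(cfg) <= max_params

    # Initial population: random configurations that respect the budget.
    population = []
    while len(population) < pop_size:
        cfg = sample(rng)
        if feasible(cfg):
            population.append(cfg)

    for _ in range(generations):
        # Keep the top_k candidates, then refill by mutation and crossover.
        parents = sorted(population, key=fitness, reverse=True)[:top_k]
        children = list(parents)
        while len(children) < pop_size:
            if rng.random() < 0.5:
                child = mutate(rng.choice(parents), rng)
            else:
                first, second = rng.sample(parents, 2)
                child = crossover(first, second, rng)
            if feasible(child):
                children.append(child)
        population = children

    return max(population, key=fitness)


def toy_fitness(cfg):
    """Stand-in for validation accuracy: simply prefers larger sub-networks."""
    return param_count(cfg)


best = evolutionary_search(toy_fitness, max_params=6_000_000)
print(best, param_count(best))
```

The parameter budget is enforced by rejecting infeasible candidates before they enter the population, which is one simple way to steer the search toward lightweight models; other constraint-handling schemes would work as well.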
Pages: 16