Visual large language model for wheat disease diagnosis in the wild

Authors
Zhang, Kunpeng [1, 2]
Ma, Li [1]
Cui, Beibei [1]
Li, Xin [1]
Zhang, Boqiang [3]
Xie, Na [4]
Affiliations
[1] Henan Univ Technol, Coll Elect Engn, Zhengzhou 450001, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
[3] Henan Univ Technol, Coll Mech Engn, Zhengzhou 450001, Peoples R China
[4] Cent Univ Finance & Econ, Sch Management Sci & Engn, Beijing 100081, Peoples R China
Keywords
Plant disease; Wheat disease diagnosis; Wheat disease classification; Large language model; Explainable AI
DOI
10.1016/j.compag.2024.109587
Chinese Library Classification (CLC) number
S [Agricultural Sciences]
Discipline classification code
09
Abstract
Early detection of symptoms in wheat plants is crucial for mitigating disease effects and preventing their spread. Prompt phytosanitary treatment minimizes yield losses and enhances treatment efficacy. In recent years, numerous image analysis-based methodologies for automatic disease identification have been developed, with Convolutional Neural Networks (CNNs) achieving notable success in visual classification tasks. However, existing methods often lack the intelligence and reasoning needed for real-world applications. This study introduces an advanced wheat disease diagnosis approach using a Visual Language Model (VLM), named the Wheat Disease Language Model (WDLM). The WDLM first leverages a modified Segment Anything Model (SAM) to isolate key wheat features from complex wild environments. To enhance its logical reasoning abilities, the WDLM integrates a reasoning chain to generate clear, reasoned explanations for its diagnoses. By employing dedicated prompt engineering, this study establishes the Wheat Disease Semantic Dataset (WDSD) to fine-tune the VLM. The WDSD, which includes a diverse set of wheat images from various sources, bridges the gap between advanced VLM technology and wheat pathology. Tailored with task-specific data, the WDLM demonstrates superior intelligence by providing accurate classification of wheat diseases and suggesting potential treatment options. Compared to CNN-based models, Transformer-based models, and other VLMs, the WDLM shows improved performance in various scenarios. Integrated with mobile applications, the WDLM approach is readily applicable in the field, representing a promising advancement in the intelligent diagnosis of wheat diseases.
Pages: 14