Visual large language model for wheat disease diagnosis in the wild

Cited by: 0
Authors
Zhang, Kunpeng [1 ,2 ]
Ma, Li [1 ]
Cui, Beibei [1 ]
Li, Xin [1 ]
Zhang, Boqiang [3 ]
Xie, Na [4 ]
Affiliations
[1] Henan Univ Technol, Coll Elect Engn, Zhengzhou 450001, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
[3] Henan Univ Technol, Coll Mech Engn, Zhengzhou 450001, Peoples R China
[4] Cent Univ Finance & Econ, Sch Management Sci & Engn, Beijing 100081, Peoples R China
Keywords
Plant disease; Wheat disease diagnosis; Wheat disease classification; Large language model; Explainable AI
DOI
10.1016/j.compag.2024.109587
Chinese Library Classification
S [Agricultural Sciences]
Discipline classification code
09
Abstract
Early detection of symptoms in wheat plants is crucial for mitigating disease effects and preventing their spread, and prompt phytosanitary treatment minimizes yield losses and enhances treatment efficacy. In recent years, numerous image analysis-based methods for automatic disease identification have been developed, with Convolutional Neural Networks (CNNs) achieving notable success in visual classification tasks. However, existing methods often lack the intelligence and reasoning needed for real-world applications. This study introduces a wheat disease diagnosis approach based on a Visual Language Model (VLM), named the Wheat Disease Language Model (WDLM). The WDLM first uses a modified Segment Anything Model (SAM) to isolate key wheat features from complex wild environments. To enhance its logical reasoning, the WDLM integrates a reasoning chain that generates clear, reasoned explanations for its diagnoses. Using dedicated prompt engineering, the study builds the Wheat Disease Semantic Dataset (WDSD) to fine-tune the VLM. The WDSD, which includes a diverse set of wheat images from various sources, bridges the gap between VLM technology and wheat pathology. Fine-tuned on this task-specific data, the WDLM classifies wheat diseases accurately and suggests potential treatment options. Compared with CNN-based models, Transformer-based models, and other VLMs, the WDLM shows improved performance in various scenarios. Integrated with mobile applications, the WDLM approach is readily applicable in the field, representing a promising advance in the intelligent diagnosis of wheat diseases.
Pages: 14