Visual large language model for wheat disease diagnosis in the wild

被引:0
|
作者
Zhang, Kunpeng [1 ,2 ]
Ma, Li [1 ]
Cui, Beibei [1 ]
Li, Xin [1 ]
Zhang, Boqiang [3 ]
Xie, Na [4 ]
机构
[1] Henan Univ Technol, Coll Elect Engn, Zhengzhou 450001, Peoples R China
[2] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
[3] Henan Univ Technol, Coll Mech Engn, Zhengzhou 450001, Peoples R China
[4] Cent Univ Finance & Econ, Sch Management Sci & Engn, Beijing 100081, Peoples R China
关键词
Plant disease; Wheat disease diagnosis; Wheat disease classification; Large language model; Explainable AI;
D O I
10.1016/j.compag.2024.109587
中图分类号
S [农业科学];
学科分类号
09 ;
摘要
Early detection of symptoms in wheat plants is crucial for mitigating disease effects and preventing their spread. Prompt phytosanitary treatment minimizes yield losses and enhances treatment efficacy. In recent years, numerous image analysis-based methodologies for automatic disease identification have been developed, with Convolutional Neural Networks (CNNs) achieving notable success in visual classification tasks. The existing methods often lack the necessary intelligence and reasoning for real-world applications. This study introduces an advanced wheat disease diagnosis approach using a Visual Language Model (VLM), named the Wheat Disease Language Model (WDLM). The WDLM first leverages the modified Segment Anything Model (SAM) to isolate key wheat features from complex wild environments. To enhance the logical reasoning abilities, the WDLM integrates a reasoning chain to generate clear, reasoned explanations for its diagnosis. By employing dedicated prompt engineering, this study establishes the Wheat Disease Semantic Dataset (WDSD) to fine-tune the VLM. The WDSD, which includes a diverse set of wheat images from various sources, bridges the gap between advanced VLM technology and wheat pathology. Tailored with task-specific data, the WDLM demonstrates superior intelligence by providing accurate classification of wheat diseases and suggesting potential treatment options. Compared to CNN-based models, Transformer-based models, and other VLMs, the WDLM shows improved performance in various scenarios. Integrated with mobile applications, the WDLM approach is readily applicable in the field, representing a promising advancement in the intelligent diagnosis of wheat diseases.
引用
收藏
页数:14
相关论文
共 50 条
  • [1] Exploring the Impact of Large Language Models on Disease Diagnosis
    Almubark, Ibrahim
    IEEE ACCESS, 2025, 13 : 8225 - 8238
  • [2] Application of large language models in disease diagnosis and treatment
    Yang Xintian
    Li Tongxin
    Su Qin
    Liu Yaling
    Kang Chenxi
    Lyu Yong
    Zhao Lina
    Nie Yongzhan
    Pan Yanglin
    中华医学杂志英文版, 2025, 138 (02)
  • [3] Application of large language models in disease diagnosis and treatment
    Yang, Xintian
    Li, Tongxin
    Su, Qin
    Liu, Yaling
    Kang, Chenxi
    Lyu, Yong
    Zhao, Lina
    Nie, Yongzhan
    Pan, Yanglin
    CHINESE MEDICAL JOURNAL, 2025, 138 (02) : 130 - 142
  • [4] Towards Efficient Compound Large Language Model System Serving in the Wild
    Zhu, Yifei
    Zhu, Botao
    Chen, Chen
    Fan, Xiaoyi
    2024 IEEE/ACM 32ND INTERNATIONAL SYMPOSIUM ON QUALITY OF SERVICE, IWQOS, 2024,
  • [5] A visual-language foundation model for disease diagnosis and doctor-patient co-decision
    Yao, Yuanqi
    Jiang, Zehua
    Guan, Zhouyu
    Luxue, Yilun
    Lee, Seungmin
    Chen, Xiang
    Yang, Haodong
    Qin, Yiming
    VISUAL COMPUTER, 2025,
  • [6] PneumoLLM: Harnessing the power of large language model for pneumoconiosis diagnosis
    Song, Meiyue
    Wang, Jiarui
    Yu, Zhihua
    Wang, Jiaxin
    Yang, Le
    Lu, Yuting
    Li, Baicun
    Wang, Xue
    Wang, Xiaoxu
    Huang, Qinghua
    Li, Zhijun
    Kanellakis, Nikolaos I.
    Liu, Jiangfeng
    Wang, Jing
    Wang, Binglu
    Yang, Juntao
    MEDICAL IMAGE ANALYSIS, 2024, 97
  • [7] A generalist medical language model for disease diagnosis assistance
    Liu, Xiaohong
    Liu, Hao
    Yang, Guoxing
    Jiang, Zeyu
    Cui, Shuguang
    Zhang, Zhaoze
    Wang, Huan
    Tao, Liyuan
    Sun, Yongchang
    Song, Zhu
    Hong, Tianpei
    Yang, Jin
    Gao, Tianrun
    Zhang, Jiangjiang
    Li, Xiaohu
    Zhang, Jing
    Sang, Ye
    Yang, Zhao
    Xue, Kanmin
    Wu, Song
    Zhang, Ping
    Yang, Jian
    Song, Chunli
    Wang, Guangyu
    NATURE MEDICINE, 2025, : 932 - 942
  • [8] PromptChainer: Chaining Large Language Model Prompts through Visual Programming
    Wu, Tongshuang
    Jiang, Ellen
    Donsbach, Aaron
    Gray, Jeff
    Molina, Alejandra
    Terry, Michael
    Cai, Carrie J.
    EXTENDED ABSTRACTS OF THE 2022 CHI CONFERENCE ON HUMAN FACTORS IN COMPUTING SYSTEMS, CHI 2022, 2022,
  • [9] A Framework for Agricultural Intelligent Analysis Based on a Visual Language Large Model
    Yu, Piaofang
    Lin, Bo
    APPLIED SCIENCES-BASEL, 2024, 14 (18):
  • [10] EarthMarker: A Visual Prompting Multimodal Large Language Model for Remote Sensing
    Zhang, Wei
    Cai, Miaoxin
    Zhang, Tong
    Zhuang, Yin
    Li, Jun
    Mao, Xuerui
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63