High-Accuracy Tomato Leaf Disease Image-Text Retrieval Method Utilizing LAFANet

Cited by: 1
Authors
Xu, Jiaxin [1]
Zhou, Hongliang [1]
Hu, Yufan [1]
Xue, Yongfei [1]
Zhou, Guoxiong [1]
Li, Liujun [2]
Dai, Weisi [1]
Li, Jinyang [1]
Affiliations
[1] Cent South Univ Forestry & Technol, Coll Comp & Informat Engn, Changsha 410004, Peoples R China
[2] Univ Idaho, Dept Soil & Water Syst, Moscow, ID 83844 USA
Source
PLANTS-BASEL | 2024 / Vol. 13 / No. 09
Keywords
LAFANet; TLDITRD; LFA; FNE-ANS; AR; image-text retrieval; cross-modal; SYSTEM;
DOI
10.3390/plants13091176
Chinese Library Classification number
Q94 [Botany];
Discipline classification code
071001;
Abstract
Tomato leaf disease control is an urgent need in smart agriculture. This paper proposes LAFANet, an image-text retrieval method that jointly analyzes image and text data, giving agricultural practitioners more comprehensive and in-depth diagnostic evidence to help safeguard tomato quality and yield. First, we collect images and text descriptions of six common tomato leaf diseases to build the Tomato Leaf Disease Image-Text Retrieval Dataset (TLDITRD), introducing image-text retrieval to tomato leaf disease retrieval. Then, using ViT and BERT, we extract detailed image features and sequences of textual features that incorporate contextual information from the image-text pairs. To address retrieval errors caused by complex backgrounds, we propose Learnable Fusion Attention (LFA), which strengthens the fusion of textual and image features and thereby extracts rich semantic information from both modalities. To further explore semantic connections across modalities, we propose False Negative Elimination-Adversarial Negative Selection (FNE-ANS), which identifies adversarial negative instances that specifically target false negatives within the triplet function, thereby imposing constraints on the model. To improve the model's generalization and precision, we propose Adversarial Regularization (AR), which adds adversarial perturbations during training to make the model more robust and adaptable to slight variations in the input data. Experimental results show that LAFANet outperformed existing state-of-the-art models on the TLDITRD dataset, with top1, top5, and top10 reaching 83.3% and 90.0%, and top1, top5, and top10 reaching 80.3%, 93.7%, and 96.3%.
LAFANet offers fresh technical backing and algorithmic insights for the retrieval of tomato leaf disease through image-text correlation.
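The abstract reports top1, top5, and top10 retrieval scores and describes a triplet function whose negatives are chosen so that false negatives are not penalized. The sketch below illustrates both ideas in generic form. It is not the authors' FNE-ANS implementation: the function names, the margin value, the hardest-negative selection rule, and the assumption that query i matches candidate i are all hypothetical choices made for illustration.

```python
import numpy as np

def recall_at_k(sim, k):
    """Fraction of queries whose true match ranks in the top k.

    Assumption: sim[i, j] is the similarity of query i to candidate j,
    and the ground-truth match of query i is candidate i.
    """
    true_scores = np.diag(sim)[:, None]
    # Rank of the true match = number of candidates scoring strictly higher.
    ranks = (sim > true_scores).sum(axis=1)
    return float((ranks < k).mean())

def triplet_loss_hardest_negative(sim, margin=0.2, false_negative_mask=None):
    """Hinge triplet loss using the hardest remaining negative per query.

    false_negative_mask[i, j] = True marks candidate j as a suspected
    false negative for query i, so it is excluded from negative selection.
    How such a mask is obtained is outside the scope of this sketch.
    """
    n = sim.shape[0]
    positives = np.diag(sim)
    # Every off-diagonal candidate starts out as a usable negative.
    usable = ~np.eye(n, dtype=bool)
    if false_negative_mask is not None:
        usable &= ~false_negative_mask
    # Masked-out entries become -inf so they can never be the maximum.
    hardest = np.where(usable, sim, -np.inf).max(axis=1)
    losses = np.maximum(0.0, margin + hardest - positives)
    return float(losses.mean())

# Toy similarity matrix for three image-text pairs (made-up values).
toy_sim = np.array([
    [0.9, 0.1, 0.3],
    [0.8, 0.7, 0.2],
    [0.1, 0.2, 0.6],
])
# Treat candidate 0 as a suspected false negative for query 1.
toy_mask = np.zeros((3, 3), dtype=bool)
toy_mask[1, 0] = True

print(recall_at_k(toy_sim, 1), recall_at_k(toy_sim, 5))
print(triplet_loss_hardest_negative(toy_sim))
print(triplet_loss_hardest_negative(toy_sim, false_negative_mask=toy_mask))
```

In the toy matrix, query 1 is outscored by candidate 0, so it contributes to the loss only until that candidate is masked out as a suspected false negative.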
Pages: 29
Related papers
39 in total
  • [1] A Precise Framework for Rice Leaf Disease Image-Text Retrieval Using FHTW-Net
    Zhou, Hongliang
    Hu, Yufan
    Liu, Shuai
    Zhou, Guoxiong
    Xu, Jiaxin
    Chen, Aibin
    Wang, Yanfeng
    Li, Liujun
    Hu, Yahui
    PLANT PHENOMICS, 2024, 7
  • [2] An automatic image-text alignment method for large-scale web image retrieval
    Baopeng Zhang
    Yanyun Qu
    Jinye Peng
    Jianping Fan
    Multimedia Tools and Applications, 2017, 76 : 21401 - 21421
  • [3] An automatic image-text alignment method for large-scale web image retrieval
    Zhang, Baopeng
    Qu, Yanyun
    Peng, Jinye
    Fan, Jianping
    MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (20) : 21401 - 21421
  • [4] Transcending Fusion: A Multiscale Alignment Method for Remote Sensing Image-Text Retrieval
    Yang, Rui
    Wang, Shuang
    Han, Yingping
    Li, Yuanheng
    Zhao, Dong
    Quan, Dou
    Guo, Yanhe
    Jiao, Licheng
    Yang, Zhi
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [5] RICH: A rapid method for image-text cross-modal hash retrieval
    Li, Bo
    Yao, Dan
    Li, Zhixin
    DISPLAYS, 2023, 79
  • [6] A TEXTURE AND SALIENCY ENHANCED IMAGE LEARNING METHOD FOR CROSS-MODAL REMOTE SENSING IMAGE-TEXT RETRIEVAL
    Yang, Rui
    Zhang, Di
    Guo, YanHe
    Wang, Shuang
    IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023, : 4895 - 4898
  • [7] Knowledge Decomposition and Replay: A Novel Cross-modal Image-text Retrieval Continual Learning Method
    Yang, Rui
    Wang, Shuang
    Zhang, Huan
    Xu, Siyuan
    Guo, YanHe
    Ye, Xiutiao
    Hou, Biao
    Jiao, Licheng
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 6510 - 6519
  • [8] A FAST AND ACCURATE METHOD FOR REMOTE SENSING IMAGE-TEXT RETRIEVAL BASED ON LARGE MODEL KNOWLEDGE DISTILLATION
    Liao, Yu
    Yang, Rui
    Xie, Tao
    Xing, Hantong
    Quan, Dou
    Wang, Shuang
    Hou, Biao
    IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023, : 5077 - 5080
  • [9] A Vegetable Leaf Disease Identification Model Based on Image-Text Cross-Modal Feature Fusion
    Feng, Xuguang
    Zhao, Chunjiang
    Wang, Chunshan
    Wu, Huarui
    Miao, Yisheng
    Zhang, Jingjian
    FRONTIERS IN PLANT SCIENCE, 2022, 13
  • [10] A Robust Method for High-Accuracy Microbe Identification Based on Image Processing
    Cao, Mingqi
    Liu, Yuncai
    PROCEEDINGS OF 2014 IEEE INTERNATIONAL CONFERENCE ON PROGRESS IN INFORMATICS AND COMPUTING (PIC), 2014, : 331 - 335