Synthetic data augmentation by diffusion probabilistic models to enhance weed recognition

Cited by: 15
Authors
Chen, Dong [1]
Qi, Xinda [1]
Zheng, Yu [1]
Lu, Yuzhen [2]
Huang, Yanbo [3]
Li, Zhaojian [4]
Affiliations
[1] Michigan State Univ, Dept Elect & Comp Engn, E Lansing, MI 48824 USA
[2] Michigan State Univ, Dept Biosyst & Agr Engn, E Lansing, MI 48824 USA
[3] USDA ARS, Genet & Sustainable Agr Res Unit, Starkville, MS 39762 USA
[4] Michigan State Univ, Dept Mech Engn, E Lansing, MI 48824 USA
Keywords
Computer vision; Data augmentation; Deep learning; Generative modeling; Precision weed management; Site-specific weed control; Generative adversarial networks
DOI
10.1016/j.compag.2023.108517
Chinese Library Classification
S [Agricultural Sciences]
Discipline classification code
09
Abstract
Weed management plays an important role in protecting crop yield and quality. Conventional weed control methods largely rely on intensive, blanket herbicide application, which incurs significant management costs and poses hazards to the environment and human health. Machine vision-based automated weeding has gained increasing attention for sustainable weed management through weed recognition and site-specific treatments. However, it remains challenging to reliably recognize weeds in variable field conditions, in part due to the difficulty of curating large-scale, expert-labeled weed image datasets for supervised training of weed recognition algorithms. Data augmentation methods, including traditional geometric/color transformations and more advanced generative adversarial networks (GANs), can supplement data collection and labeling efforts by algorithmically expanding the scale of datasets. Recently, diffusion models have emerged in the field of image synthesis, providing a new means of augmenting image datasets to power machine vision systems. This study presents a novel investigation of the efficacy of diffusion models for generating weed images to enhance weed identification. Experiments on two large, public, multi-class weed datasets showed that diffusion models yielded the best trade-off between sample fidelity and diversity and obtained the best Fréchet Inception Distance, compared to GANs (BigGAN, StyleGAN2, StyleGAN3). For instance, on a ten-class weed dataset (CottonWeedID10), the inclusion of synthetic weed images led to improvements of 1.17% (97.30% to 98.47%), 1.21% (97.92% to 99.13%), and 2.30% (96.06% to 98.27%) in accuracy, precision, and recall, respectively, in weed classification by four deep learning models (i.e., VGG16, Inception-v3, Inception-v3, and ResNet50). Models trained using only 10% of real images, with the remainder being synthetic data, achieved testing accuracy exceeding 94%.
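The last result in the abstract (training on 10% real images with the remainder synthetic) can be illustrated with a minimal Python sketch. The abstract does not describe how the authors built the mixed set, so everything here is an assumption: the function name `mix_real_and_synthetic`, the per-class sampling, the choice to keep each class at its original size, and the file names are all hypothetical.

```python
import random
from collections import defaultdict


def mix_real_and_synthetic(real, synthetic, real_fraction, seed=0):
    """Build a training list that keeps `real_fraction` of the real images
    per class and fills the rest of each class with synthetic images.

    `real` and `synthetic` are lists of (path, label) pairs.
    Returns a shuffled list of (path, label, source) triples.
    This is an illustrative protocol, not the one used in the paper.
    """
    if not 0.0 <= real_fraction <= 1.0:
        raise ValueError("real_fraction must be in [0, 1]")

    rng = random.Random(seed)  # fixed seed so the split is reproducible

    real_by_class = defaultdict(list)
    synth_by_class = defaultdict(list)
    for path, label in real:
        real_by_class[label].append(path)
    for path, label in synthetic:
        synth_by_class[label].append(path)

    mixed = []
    for label, paths in sorted(real_by_class.items()):
        n_total = len(paths)                      # keep the class size unchanged
        n_real = round(n_total * real_fraction)   # real images to keep
        n_synth = n_total - n_real                # synthetic images to add
        pool = synth_by_class[label]
        if n_synth > len(pool):
            raise ValueError(f"not enough synthetic images for class {label!r}")
        mixed += [(p, label, "real") for p in rng.sample(paths, n_real)]
        mixed += [(p, label, "synthetic") for p in rng.sample(pool, n_synth)]

    rng.shuffle(mixed)
    return mixed


# Hypothetical file names for two weed classes, 20 real and 30 synthetic each.
real_images = [(f"real/{c}_{i}.jpg", c) for c in ("classA", "classB") for i in range(20)]
synthetic_images = [(f"synth/{c}_{i}.jpg", c) for c in ("classA", "classB") for i in range(30)]
train_list = mix_real_and_synthetic(real_images, synthetic_images, real_fraction=0.10)
```

Sampling per class keeps the class balance of the original dataset, so any change in test accuracy reflects the real-to-synthetic ratio and not a shift in class frequencies.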
Pages: 12