Standardizing and Centralizing Datasets for Efficient Training of Agricultural Deep Learning Models

Cited: 4
Authors
Joshi, Amogh [1 ,2 ,3 ]
Guevara, Dario [1 ,2 ,3 ]
Earles, Mason [1 ,2 ,3 ]
Affiliations
[1] Univ Calif Davis, Dept Viticulture & Enol, Davis, CA 95616 USA
[2] Univ Calif Davis, Dept Biol & Agr Engn, Davis, CA 95616 USA
[3] Univ Calif Davis, AI Inst Next Generat Food Syst AIFS, Davis, CA 95616 USA
Source
PLANT PHENOMICS | 2023, Vol. 5
DOI
10.34133/plantphenomics.0084
Chinese Library Classification
S3 [Agronomy];
Discipline Code
0901;
Abstract
In recent years, deep learning models have become the standard for agricultural computer vision. Such models are typically fine-tuned to agricultural tasks using model weights that were originally fit to more general, non-agricultural datasets. This lack of agriculture-specific fine-tuning potentially increases training time and resource use, and decreases model performance, leading to an overall decrease in data efficiency. To overcome this limitation, we collect a wide range of existing public datasets for 3 distinct tasks, standardize them, and construct standard training and evaluation pipelines, providing us with a set of benchmarks and pretrained models. We then conduct a number of experiments using methods that are commonly used in deep learning tasks but unexplored in their domain-specific applications for agriculture. Our experiments guide us in developing a number of approaches to improve data efficiency when training agricultural deep learning models, without large-scale modifications to existing pipelines. Our results demonstrate that even slight training modifications, such as using agricultural pretrained model weights, or adopting specific spatial augmentations into data processing pipelines, can considerably boost model performance and result in shorter convergence time, saving training resources. Furthermore, we find that even models trained on low-quality annotations can produce comparable levels of performance to their high-quality equivalents, suggesting that datasets with poor annotations can still be used for training, expanding the pool of currently available datasets. Our methods are broadly applicable throughout agricultural deep learning and present high potential for substantial data efficiency improvements.
Pages: 15
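
The abstract highlights two lightweight training modifications: initializing from agriculture-pretrained model weights and adding spatial augmentations to the data processing pipeline. As a rough illustration of what such a setup can look like, below is a minimal PyTorch/torchvision sketch; the checkpoint path agri_pretrained.pth and the value of num_classes are hypothetical placeholders, not artifacts released with the paper.

import os
import torch
import torch.nn as nn
import torchvision
from torchvision import transforms

# Spatial augmentations of the kind the abstract credits with boosting
# performance: scale jitter, flips, and small rotations.
train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),   # plausible for top-down field imagery
    transforms.RandomRotation(degrees=15),
    transforms.ToTensor(),
])

# Baseline: generic (non-agricultural) ImageNet initialization.
model = torchvision.models.resnet50(
    weights=torchvision.models.ResNet50_Weights.IMAGENET1K_V2
)

# If an agriculture-pretrained checkpoint is available, load it over the
# generic weights; "agri_pretrained.pth" is a hypothetical placeholder.
if os.path.exists("agri_pretrained.pth"):
    state_dict = torch.load("agri_pretrained.pth", map_location="cpu")
    model.load_state_dict(state_dict, strict=False)  # tolerate a mismatched head

# Re-head the backbone for the downstream task (num_classes is task-specific).
num_classes = 10
model.fc = nn.Linear(model.fc.in_features, num_classes)

Either initialization then proceeds through the same fine-tuning loop; the abstract's claim is that the agricultural starting point converges faster and scores higher without any other pipeline changes.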