Wheat ear counting method in UAV images based on TPH-YOLO

Authors
Bao W. [1]
Xie W. [1]
Hu G. [1]
Yang X. [2]
Su B. [1]
Affiliations
[1] National Engineering Research Center for Agro-Ecological Big Data Analysis & Application, Anhui University, Hefei
[2] Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei
Keywords
attention mechanisms; image processing; transfer learning; transformer encoder; UAV; wheat ear counting; YOLOv5
DOI
10.11975/j.issn.1002-6819.202210020
Abstract
In recent years, optical sensors have been widely mounted on unmanned aerial vehicles (UAVs) to capture images of many kinds of crops. This economical and effective approach can contribute greatly to yield prediction and field management in modern agriculture. However, wheat ear counting remains challenging because wheat ears are densely distributed, overlap heavily, and appear against complex backgrounds in the images. In this study, a wheat ear detection model based on transformer prediction heads “you only look once” (TPH-YOLO) was designed to improve the accuracy of wheat ear counting in UAV images, which were taken as the research object. Firstly, the Retinex algorithm was used to enhance the wheat ear images collected by the UAV, in order to reduce the influence of uneven illumination on image quality. Secondly, the coordinate attention (CA) mechanism was added to the backbone network of YOLOv5 to refine the extracted features. As a result, the TPH-YOLO network focused mainly on wheat ear information while avoiding interference from background factors such as wheat stalks and leaves. Thirdly, the original prediction head of YOLOv5 was replaced with a transformer prediction head (TPH). The improved prediction head draws on the prediction potential of the multi-head attention mechanism to locate wheat ears accurately in high-density scenes. Finally, a transfer learning training strategy was adopted to improve the generalization ability and detection accuracy of the TPH-YOLO network: a dataset of wheat ear images collected in the field was used to pre-train the model, and the wheat ear image dataset collected by the UAV was then used to update and optimize the model parameters.
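The illumination-correction step can be illustrated with a minimal sketch. The abstract does not say which Retinex variant was used, so the code below assumes single-scale Retinex (log of the image minus log of a Gaussian-blurred illumination estimate) on a single grayscale channel; the function names, `sigma`, and `radius` are hypothetical choices for illustration, and the code is kept dependency-free rather than matching the authors' implementation.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian weights of length 2 * radius + 1."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_1d(values, kernel):
    """Convolve one row or column, replicating the edge pixels."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out


def gaussian_blur(image, sigma, radius):
    """Separable Gaussian blur of a 2D list of floats."""
    kernel = gaussian_kernel(sigma, radius)
    rows = [blur_1d(row, kernel) for row in image]
    cols = [blur_1d(list(col), kernel) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def single_scale_retinex(image, sigma=2.0, radius=4):
    """Assumed variant: R = log(I + 1) - log(blur(I) + 1).

    The blurred image approximates the slowly varying illumination;
    subtracting it in the log domain keeps the local reflectance detail.
    """
    illumination = gaussian_blur(image, sigma, radius)
    return [
        [math.log(p + 1.0) - math.log(l + 1.0) for p, l in zip(img_row, ill_row)]
        for img_row, ill_row in zip(image, illumination)
    ]
```

Under these assumptions a uniformly lit patch maps to values near zero, while a pixel brighter than its surroundings gets a positive response. A color image would apply the same operation per channel, and the output would normally be rescaled to the displayable range before being passed to the detector.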
A series of experiments was conducted on the wheat ear images collected by the UAV. The performance of the detection model was evaluated with three indicators: precision, recall, and average precision (AP). The experimental results show that the precision, recall, and AP of the improved model were 87.2%, 84.1%, and 88.8%, respectively. The AP of the improved model was 4.1% higher than that of the original YOLOv5, and its performance was also better than that of the SSD, Faster-RCNN, CenterNet, and YOLOv5 detection models. In addition, because it contains diverse and typical wheat samples, the Global Wheat Head Detection (GWHD) dataset was selected for comparative experiments on the different detection models. On this dataset, compared with SSD, Faster-RCNN, CenterNet, and YOLOv5, the AP of the improved model increased by 11.1, 5.4, 6.9, and 3.3 percentage points, respectively. This comparative analysis further verified the reliability and effectiveness of the improved model. Consequently, the findings can provide strong support for wheat yield prediction. © 2023 Chinese Society of Agricultural Engineering. All rights reserved.
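The three indicators named in the abstract can be sketched as follows. The abstract does not give the matching rule or the AP interpolation scheme, so this sketch assumes each detection has already been labeled as a true or false positive (for example by an IoU threshold applied upstream) and uses all-point interpolation of the precision-recall curve; the function names and input format are hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def average_precision(scored_hits, num_ground_truth):
    """Area under the precision-recall curve (all-point interpolation).

    scored_hits: list of (confidence, is_true_positive) pairs, one per
    detection; matching to ground-truth ears is assumed done upstream.
    num_ground_truth: total number of annotated wheat ears.
    """
    if num_ground_truth == 0:
        return 0.0
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for _, is_hit in ranked:
        if is_hit:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas between consecutive recall levels.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

For example, `average_precision([(0.9, True), (0.8, False), (0.7, True)], 2)` ranks the three detections by confidence and integrates the resulting curve over the two annotated ears.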
Pages: 155-161
Number of pages: 6