Apple fruit recognition in complex orchard environment based on improved YOLOv3

Authors
Zhao H. [1, 2]
Qiao Y. [1]
Wang H. [1]
Yue Y. [1]
Affiliations
[1] School of Electrical and Electronic Engineering, Tianjin University of Technology/Tianjin Key Laboratory of Complex System Control Theory and Application, Tianjin
[2] School of Engineering and Technology, Tianjin Agricultural University, Tianjin
Keywords
Complex environment; Fruit identification; Harvester; Image processing; Object detection; YOLOv3
DOI
10.11975/j.issn.1002-6819.2021.16.016
Abstract
Automatic fruit recognition is one of the most important steps for fruit-picking robots. In this study, a novel fruit recognition method based on an improved YOLOv3 was proposed so that a picking robot can identify fruit quickly and accurately in the complex environment of an orchard (varying light, occlusion, adhesion, large field of view, bagging, and both mature and immature fruit). The procedure was as follows. 1) A total of 4000 apple images were collected under complex conditions by shooting in orchards and gathering from the Internet. After labeling with the LabelImg software, 3200 images were randomly selected as the training set, 400 as the validation set, and 400 as the test set. Mosaic data augmentation was embedded in the model to enrich the input images and improve the generalization ability and robustness of the model. 2) The network model was improved in four ways. First, the residual module in the DarkNet53 network was combined with CSPNet to reduce the amount of network computation while maintaining detection accuracy. Second, an SPP module was added to the detection network of the original YOLOv3 model to fuse the global and local features of the fruit and raise the recall of the model for very small fruit targets. Third, Soft-NMS replaced the traditional NMS to improve recognition of overlapping fruit. Fourth, a joint loss function combining Focal Loss and CIoU Loss was used to optimize the model for higher recognition accuracy. 3) After dataset production and network construction, the model was trained in a deep learning environment on a server, and the training process was analyzed. The optimal weights and parameters were selected according to the loss curve and the performance indexes on the validation set.
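The Soft-NMS step named above can be illustrated with a minimal sketch. The abstract does not state which decay variant or which parameters the authors used, so the Gaussian decay, the `sigma` value, the score threshold, and the function names `iou` and `soft_nms` are all assumptions made here for illustration. Instead of discarding every box that overlaps the current best detection, the score of each overlapping box is reduced according to its IoU, which lets a partly hidden apple survive next to a neighbouring one.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (sigma and score_thresh are illustrative values).

    Returns a list of (box_index, rescored_confidence) in selection order.
    """
    candidates = list(zip(range(len(boxes)), scores))
    kept = []
    while candidates:
        # Pick the remaining box with the highest (possibly decayed) score.
        best = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_idx, best_score = candidates.pop(best)
        kept.append((best_idx, best_score))
        rescored = []
        for idx, score in candidates:
            overlap = iou(boxes[best_idx], boxes[idx])
            # Decay the score smoothly instead of removing the box outright.
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:
                rescored.append((idx, score))
        candidates = rescored
    return kept


if __name__ == "__main__":
    # Two heavily overlapping boxes and one isolated box (made-up values).
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    for idx, score in soft_nms(boxes, scores):
        print(idx, round(score, 3))
```

With hard NMS at a typical IoU threshold of 0.5, the second box in this toy input would be dropped; here it is kept with a lower score, and a downstream confidence threshold decides whether it counts as a detection.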
The best performance was achieved at the 109th epoch, and the weights obtained in that epoch were taken as the final model weights: precision was 94.1%, recall 90.6%, F1 92.3%, and mean average precision 96.1%. The optimal model was then evaluated on the test set. The mean average precision reached 96.3%, higher than the 92.5% of the original model; the F1 value reached 91.8%, higher than the 88.0% of the original model; and the average detection speed on a video stream under GPU was 27.8 frames/s, higher than the 22.2 frames/s of the original model. Compared with four advanced detection models (Faster RCNN, RetinaNet, YOLOv5 and CenterNet), the improved model achieved the best comprehensive performance, which verified the effectiveness of the improvements. Comparison experiments were also conducted under different fruit numbers and lighting conditions to further verify the effectiveness and feasibility of the improved model. The improved model detected small-target apples and severely occluded, overlapping apples significantly better than the original YOLOv3 model. In addition, detection based on deep learning was robust to illumination: changes in illumination had little impact on detection performance. Consequently, with its detection accuracy, robustness and real-time performance, the model can be expected to provide important support for accurate fruit recognition in complex environments. © 2021, Editorial Department of the Transactions of the Chinese Society of Agricultural Engineering. All rights reserved.
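As a quick consistency check on the reported figures, the F1 value can be recomputed from precision and recall as their harmonic mean. The short sketch below uses only numbers quoted in the abstract; the function name `f1_score` is a local helper introduced here, not something taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Validation-set figures reported for the 109th epoch.
val_f1 = f1_score(0.941, 0.906)
print(f"F1 from reported precision/recall: {val_f1:.3f}")  # prints 0.923

# Test-set comparison against the original YOLOv3, as quoted in the abstract.
map_gain = 96.3 - 92.5   # percentage points of mean average precision
f1_gain = 91.8 - 88.0    # percentage points of F1
speed_ratio = 27.8 / 22.2
print(f"mAP gain: {map_gain:.1f} points, F1 gain: {f1_gain:.1f} points")
print(f"Detection speed ratio: {speed_ratio:.2f}x")
```

The recomputed value agrees with the reported F1 of 92.3%, and the frame rates correspond to roughly a 25% speed-up over the original model.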
Pages: 127-135
Number of pages: 8