Improving real-time apple fruit detection: Multi-modal data and depth fusion with non-targeted background removal

Cited by: 2
Authors
Kaukab, Shaghaf [1]
Komal
Ghodki, Bhupendra M. [2]
Ray, Hena [3]
Kalnar, Yogesh B. [1]
Narsaiah, Kairam [4]
Brar, Jaskaran S. [1]
Affiliations
[1] ICAR Res Complex, Cent Inst Postharvest Engn & Technol, Ludhiana 141004, India
[2] Indian Inst Technol Kharagpur, Agr & Food Engn Dept, Kharagpur 721302, India
[3] Ctr Dev Adv Comp, Kolkata 700091, India
[4] Indian Council Agr Res, Div Agr Engn, New Delhi 110012, India
Keywords
Apple; Fruit detection; 3D localization; YOLO network; RGB-D images; Depth sensor; FASTER R-CNN; RGB; LOCALIZATION; RED
DOI
10.1016/j.ecoinf.2024.102691
Chinese Library Classification (CLC) number
Q14 [Ecology (bioecology)]
Discipline classification code
071012; 0713
Abstract
In automated fruit detection, RGB-Depth (RGB-D) images give the detection model additional depth information that can improve detection accuracy. However, depth images captured outdoors are usually of low quality, which limits the usefulness of the depth data. This study presents an approach for real-time apple fruit detection in a high-density orchard environment using multi-modal data. A non-targeted background removal using depth fusion (NBR-DF) method was developed to reduce the high noise level of depth images. The noise arises from uncontrolled lighting conditions and from holes with incomplete depth information in the depth images. The NBR-DF technique follows three primary steps: pre-processing of depth images (point cloud generation), target object extraction, and background removal. The NBR-DF method serves as a pipeline that pre-processes multi-modal data and enhances the features of depth images by filling holes, which eliminates the noise those holes generate. Further, NBR-DF combined with YOLOv5 improves detection accuracy in dense orchard conditions by using multi-modal information as input. An attention-based depth fusion module that adaptively fuses the multi-modal features was developed. The depth-attention matrix is integrated through pooling operations and sigmoid normalization, both of which are efficient methods for summarizing and normalizing depth information. The fusion module improves the identification of multiscale objects and strengthens the network's resistance to noise. The network then detects the fruit position using multiscale information from the RGB-D images in highly complex orchard environments. The detection results were compared and validated against other methods using different input modalities and fusion strategies. The NBR-DF approach achieved an average precision of 0.964 in real time. The performance comparison with other state-of-the-art methods and the model generalization study also establish that the depth-fusion attention mechanism and the pre-processing steps in NBR-DF-YOLOv5 significantly outperform those methods. In conclusion, the developed NBR-DF technique showed the potential to improve real-time apple fruit detection using multi-modal information.
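The abstract names the stages but gives no implementation details, so the following is only a minimal NumPy sketch of the general ideas: hole filling, depth-based background removal, and a depth-attention map built from pooling and sigmoid normalization. All function names, the median-based hole fill, the `max_range` cut-off, and the mean-centring before the sigmoid are assumptions made for illustration, not the authors' NBR-DF code.

```python
import numpy as np


def fill_depth_holes(depth, invalid=0.0):
    """Fill hole pixels (value == invalid) with the median of the valid depths.

    A crude stand-in for hole filling; not the method used in the paper.
    """
    depth = np.asarray(depth, dtype=np.float32)
    valid = depth != invalid
    if not valid.any():
        return depth.copy()
    filled = depth.copy()
    filled[~valid] = np.median(depth[valid])
    return filled


def remove_background(rgb, depth, max_range):
    """Zero out RGB pixels that lie farther away than max_range.

    max_range is a hypothetical cut-off in the same units as depth.
    """
    mask = depth <= max_range
    return rgb * mask[..., None], mask


def avg_pool2d(x, k):
    """Non-overlapping k x k average pooling of a 2-D array."""
    h, w = x.shape
    h2, w2 = h // k, w // k
    return x[: h2 * k, : w2 * k].reshape(h2, k, w2, k).mean(axis=(1, 3))


def depth_attention_fuse(features, depth, k):
    """Weight a (C, H/k, W/k) feature map by a sigmoid-normalised depth map.

    The depth map is pooled to the feature resolution, centred on its mean
    (an assumption), and squashed to (0, 1) with a sigmoid.
    """
    pooled = avg_pool2d(depth, k)
    attn = 1.0 / (1.0 + np.exp(-(pooled - pooled.mean())))
    return features * attn[None, :, :], attn


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    depth = rng.uniform(0.5, 4.0, size=(8, 8)).astype(np.float32)
    depth[2, 3] = 0.0  # simulate a depth hole
    rgb = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)

    filled = fill_depth_holes(depth)
    foreground, mask = remove_background(rgb, filled, max_range=2.0)
    features = rng.normal(size=(16, 4, 4)).astype(np.float32)
    fused, attn = depth_attention_fuse(features, filled, k=2)
    print(foreground.shape, fused.shape, attn.shape)
```

In this sketch the attention map has one weight per spatial location and is broadcast across all feature channels; a learned module would typically add trainable layers around the pooling and sigmoid steps.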
Pages: 13