Improving real-time apple fruit detection: Multi-modal data and depth fusion with non-targeted background removal

Cited by: 2
Authors
Kaukab, Shaghaf [1]
Komal
Ghodki, Bhupendra M. [2]
Ray, Hena [3]
Kalnar, Yogesh B. [1]
Narsaiah, Kairam [4]
Brar, Jaskaran S. [1]
Affiliations
[1] ICAR Res Complex, Cent Inst Postharvest Engn & Technol, Ludhiana 141004, India
[2] Indian Inst Technol Kharagpur, Agr & Food Engn Dept, Kharagpur 721302, India
[3] Ctr Dev Adv Comp, Kolkata 700091, India
[4] Indian Council Agr Res, Div Agr Engn, New Delhi 110012, India
Keywords
Apple; Fruit detection; 3D localization; YOLO network; RGB-D images; Depth sensor; FASTER R-CNN; RGB; LOCALIZATION; RED;
DOI
10.1016/j.ecoinf.2024.102691
Chinese Library Classification number
Q14 [Ecology (Bioecology)];
Discipline classification code
071012; 0713;
Abstract
In automated fruit detection, RGB-Depth (RGB-D) images give the detection model additional depth information that can improve detection accuracy. However, depth images captured outdoors are usually of low quality, which limits the usefulness of the depth data. This study presents an approach for real-time apple fruit detection in a high-density orchard environment using multi-modal data. A non-targeted background removal using depth fusion (NBR-DF) method was developed to reduce the high noise in depth images. The noise arises from uncontrolled lighting conditions and from holes with incomplete depth information in the depth images. The NBR-DF technique follows three primary steps: pre-processing of depth images (point cloud generation), target object extraction, and background removal. The method serves as a pipeline that pre-processes multi-modal data and enhances the features of depth images by filling holes, which eliminates the noise those holes generate. Further, NBR-DF implemented with YOLOv5 enhances detection accuracy in dense orchard conditions by using multi-modal information as input. An attention-based depth fusion module that adaptively fuses the multi-modal features was developed. The depth-attention matrix is integrated through pooling operations and sigmoid normalization, both of which are efficient methods for summarizing and normalizing depth information. The fusion module improves the identification of multiscale objects and strengthens the network's resistance to noise. The network then detects the fruit position using multiscale information from the RGB-D images in highly complex orchard environments. The detection results were compared and validated against other methods using different input modalities and fusion strategies. The NBR-DF approach achieved an average precision of 0.964 in real time. The performance comparison with other state-of-the-art methods and the model generalization study also establish that the depth-fusion attention mechanism and the preprocessing steps in NBR-DF-YOLOv5 significantly outperform those methods. In conclusion, the developed NBR-DF technique showed the potential to improve real-time apple fruit detection using multi-modal information.
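The ideas named in the abstract (filling depth holes, removing the background by depth, and turning pooled depth into sigmoid-normalized attention weights) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 4-neighbour hole filling, the fixed `max_depth` cutoff, and the block-average pooling are all assumptions chosen for illustration, and the paper's point cloud generation, target extraction, and YOLOv5 integration are not reproduced here.

```python
import math


def fill_depth_holes(depth):
    """Replace zero (missing) depth values with the mean of valid 4-neighbours.

    Assumption: a depth of 0 marks a hole, as is common for depth sensors.
    """
    h, w = len(depth), len(depth[0])
    filled = [row[:] for row in depth]
    for y in range(h):
        for x in range(w):
            if depth[y][x] == 0:
                neighbours = [
                    depth[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w and depth[ny][nx] > 0
                ]
                if neighbours:
                    filled[y][x] = sum(neighbours) / len(neighbours)
    return filled


def remove_background(image, depth, max_depth):
    """Zero out pixels farther than max_depth or still lacking a depth value.

    Assumption: a single hypothetical distance cutoff separates the target row
    from the background.
    """
    return [
        [pix if 0 < d <= max_depth else 0 for pix, d in zip(img_row, d_row)]
        for img_row, d_row in zip(image, depth)
    ]


def depth_attention(depth, k=2):
    """Average-pool depth over k x k blocks, then normalize with a sigmoid."""
    h, w = len(depth), len(depth[0])
    weights = []
    for y in range(0, h - h % k, k):
        row = []
        for x in range(0, w - w % k, k):
            block = [depth[y + dy][x + dx] for dy in range(k) for dx in range(k)]
            mean = sum(block) / (k * k)
            row.append(1.0 / (1.0 + math.exp(-mean)))
        weights.append(row)
    return weights


def fuse(features, weights):
    """Scale each feature value by its attention weight (element-wise)."""
    return [
        [f * a for f, a in zip(f_row, a_row)]
        for f_row, a_row in zip(features, weights)
    ]


# Toy data: a 2 x 3 grayscale image and a depth map (metres) with one hole.
image = [[10, 20, 30], [40, 50, 60]]
depth = fill_depth_holes([[1.0, 0.0, 5.0], [1.0, 1.0, 5.0]])
foreground = remove_background(image, depth, max_depth=3.0)
attention = depth_attention(depth, k=2)
```

In a trained network the attention weights would be computed from learned feature maps rather than raw depth values, so the sketch only shows the order of operations: fill holes, mask by depth, pool, normalize, and re-weight.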
Pages: 13