Noise-tolerant RGB-D feature fusion network for outdoor fruit detection

Cited by: 24
Authors
Sun, Qixin [1 ,2 ]
Chai, Xiujuan [1 ,2 ]
Zeng, Zhikang [3 ]
Zhou, Guomin [1 ,2 ]
Sun, Tan [1 ,2 ]
Institutions
[1] Chinese Acad Agr Sci, Agr Informat Inst, Beijing 100081, Peoples R China
[2] Minist Agr & Rural Affairs, Key Lab Agr Big Data, Beijing 100081, Peoples R China
[3] Guangxi Acad Agr Sci, Agr Sci & Technol Informat Res Inst, Nanning 530007, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-modal; Feature fusion; Attention mechanism; Object detection; Convolutional neural network; APPLE DETECTION; COLOR; RED;
DOI
10.1016/j.compag.2022.107034
Chinese Library Classification
S [Agricultural Sciences];
Subject Classification Code
09;
Abstract
In the process of farm automation, fruit detection is the basis of, and a prerequisite for, yield prediction, automatic picking, and other orchard operations. RGB images capture only two-dimensional information about the scene, which is insufficient to distinguish fruits that grow densely or are occluded by branches and leaves. With the development of depth sensors, using RGB-D images with complementary information can boost fruit detection performance. However, owing to sensor characteristics and scene configurations, outdoor depth images are of poor quality, which poses a challenge when fusing RGB-D features. Therefore, this paper proposes an end-to-end RGB-D object detection network, termed the noise-tolerant feature fusion network (NT-FFN), to exploit outdoor multi-modal data properly and improve detection accuracy. Specifically, the NT-FFN first uses two structurally identical feature extractors to extract single-modal (color and depth) features, which form the basis for the subsequent feature fusion. Then, to avoid introducing excessive depth noise and to focus perception on the important parts of the features, an attention-based fusion module is designed to adaptively fuse the multi-modal features. Finally, multi-scale features from the color images and the fusion modules are used to predict object positions, which not only improves the network's ability to detect multi-scale objects but also further enhances its noise immunity. In addition, this paper constructs an RGB-D citrus fruit dataset, which supports a comprehensive evaluation of the proposed network. On this dataset, the NT-FFN achieves an AP50 of 95.4% at real-time speed, outperforming single-modal methods, common multi-modal fusion strategies, and advanced multi-modal detection methods. The proposed NT-FFN also achieves excellent results on other fruit detection tasks, which verifies its generalization ability.
This study provides the possibility and a foundation for performing multi-modal information fusion in outdoor fruit detection.
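The abstract describes an attention-based fusion module that adaptively weights depth features before combining them with color features, so that noisy depth channels are suppressed. The paper's exact module design is not given here; the following is a minimal sketch, assuming an SE-style channel-attention gate in which per-channel weights are computed from globally pooled depth features (the function `attention_fuse` and its parameters `w`, `b` are hypothetical names, not from the paper):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_fuse(color_feat, depth_feat, w, b):
    """Gated channel-attention fusion of RGB-D features (illustrative sketch).

    color_feat, depth_feat: feature maps of shape (C, H, W).
    w, b: learnable parameters of a per-channel linear gate, shapes (C, C) and (C,).
    The gate is computed from globally pooled depth features, so channels
    dominated by depth noise can be down-weighted before fusion.
    """
    # Global average pooling over the spatial dimensions -> (C,)
    pooled = depth_feat.mean(axis=(1, 2))
    # Per-channel attention weights squashed into (0, 1)
    gate = sigmoid(w @ pooled + b)
    # Re-weight depth features channel-wise and add them to the color stream
    fused = color_feat + gate[:, None, None] * depth_feat
    return fused, gate
```

In a full network, this fusion would be applied at several scales, with the fused maps (plus the color-only maps) feeding the detection head, mirroring the multi-scale prediction described above.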
Pages: 13