Three-Stream Attention-Aware Network for RGB-D Salient Object Detection

Cited by: 228
Authors:
Chen, Hao [1]
Li, Youfu [1]
Affiliation:
[1] City Univ Hong Kong, Dept Mech Engn, Hong Kong, Peoples R China
Keywords:
Three-stream; RGB-D; saliency detection; cross-modal cross-level attention; FUSION
DOI: 10.1109/TIP.2019.2891104
Chinese Library Classification: TP18 [Artificial intelligence theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Previous RGB-D fusion systems based on convolutional neural networks typically employ a two-stream architecture, in which the RGB and depth inputs are learned independently. The multi-modal fusion stage is typically performed by concatenating the deep features from each stream during inference. This traditional two-stream architecture may suffer from insufficient multi-modal fusion due to the following two limitations: 1) the cross-modal complementarity is rarely studied in the bottom-up path, wherein we believe the cross-modal complements can be combined to learn new discriminative features to enlarge the RGB-D representation community, and 2) the cross-modal channels are typically combined by undifferentiated concatenation, which is ambiguous for selecting complementary cross-modal features. In this paper, we address these two limitations by proposing a novel three-stream attention-aware multi-modal fusion network. In the proposed architecture, a cross-modal distillation stream, accompanying the RGB-specific and depth-specific streams, is introduced to extract new RGB-D features at each level of the bottom-up path. Furthermore, the channel-wise attention mechanism is innovatively introduced to the cross-modal cross-level fusion problem to adaptively select complementary feature maps from each modality at each level. Extensive experiments demonstrate the effectiveness of the proposed architecture and its significant improvement over state-of-the-art RGB-D salient object detection methods.
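The abstract describes channel-wise attention applied to cross-modal cross-level fusion of the RGB, depth, and distilled RGB-D streams. Below is a minimal PyTorch-style sketch of that idea, assuming a squeeze-and-excitation style gating over the concatenated same-level feature maps; the module name, reduction factor, and tensor shapes are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class ChannelAttentionFusion(nn.Module):
    # Fuse same-level RGB, depth, and distilled RGB-D feature maps by learning
    # per-channel weights instead of plain (undifferentiated) concatenation.
    # Hypothetical sketch; not the paper's exact fusion module.
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)              # global average pooling ("squeeze")
        self.mlp = nn.Sequential(                       # bottleneck MLP ("excitation")
            nn.Linear(3 * channels, 3 * channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(3 * channels // reduction, 3 * channels),
            nn.Sigmoid(),
        )

    def forward(self, rgb, depth, rgbd):
        x = torch.cat([rgb, depth, rgbd], dim=1)        # (B, 3C, H, W)
        b, c, _, _ = x.shape
        w = self.mlp(self.gap(x).view(b, c))            # per-channel weights in [0, 1]
        return x * w.view(b, c, 1, 1)                   # re-weighted fused features

# Usage: fuse 64-channel feature maps produced by the three streams at one level.
fusion = ChannelAttentionFusion(channels=64)
rgb, depth, rgbd = (torch.randn(2, 64, 56, 56) for _ in range(3))
fused = fusion(rgb, depth, rgbd)                        # shape: (2, 192, 56, 56)

The learned gating replaces undifferentiated concatenation with per-channel weights, which is the adaptive selection behaviour the abstract attributes to the channel-wise attention mechanism.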
Pages: 2825-2835
Page count: 11
Related Papers
50 records in total
  • [31] RGB-D salient object detection: A survey
    Zhou, Tao
    Fan, Deng-Ping
    Cheng, Ming-Ming
    Shen, Jianbing
    Shao, Ling
    [J]. COMPUTATIONAL VISUAL MEDIA, 2021, 7 (01) : 37 - 69
  • [33] Calibrated RGB-D Salient Object Detection
    Ji, Wei
    Li, Jingjing
    Yu, Shuang
    Zhang, Miao
    Piao, Yongri
    Yao, Shunyu
    Bi, Qi
    Ma, Kai
    Zheng, Yefeng
    Lu, Huchuan
    Cheng, Li
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 9466 - 9476
  • [34] Bidirectional feature learning network for RGB-D salient object detection
    Niu, Ye
    Zhou, Sanping
    Dong, Yonghao
    Wang, Le
    Wang, Jinjun
    Zheng, Nanning
    [J]. PATTERN RECOGNITION, 2024, 150
  • [35] Feature Calibrating and Fusing Network for RGB-D Salient Object Detection
    Zhang, Qiang
    Qin, Qi
    Yang, Yang
    Jiao, Qiang
    Han, Jungong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (03) : 1493 - 1507
  • [36] Triple-Complementary Network for RGB-D Salient Object Detection
    Huang, Rui
    Xing, Yan
    Zou, Yaobin
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2020, 27 : 775 - 779
  • [37] GroupTransNet: Group transformer network for RGB-D salient object detection
    Fang, Xian
    Jiang, Mingfeng
    Zhu, Jinchao
    Shao, Xiuli
    Wang, Hongpeng
    [J]. NEUROCOMPUTING, 2024, 594
  • [38] Hierarchical Alternate Interaction Network for RGB-D Salient Object Detection
    Li, Gongyang
    Liu, Zhi
    Chen, Minyu
    Bai, Zhen
    Lin, Weisi
    Ling, Haibin
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 3528 - 3542
  • [39] An adaptive guidance fusion network for RGB-D salient object detection
    Sun, Haodong
    Wang, Yu
    Ma, Xinpeng
    [J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (02) : 1683 - 1693
  • [40] DMNet: Dynamic Memory Network for RGB-D Salient Object Detection
    Du, Haishun
    Zhang, Zhen
    Zhang, Minghao
    Qiao, Kangyi
    [J]. DIGITAL SIGNAL PROCESSING, 2023, 142