Three-Stream Attention-Aware Network for RGB-D Salient Object Detection

Cited by: 228
Authors:
Chen, Hao [1]
Li, Youfu [1]
Affiliation:
[1] City Univ Hong Kong, Dept Mech Engn, Hong Kong, Peoples R China
Keywords:
Three-stream; RGB-D; saliency detection; cross-modal cross-level attention; FUSION
DOI: 10.1109/TIP.2019.2891104
Chinese Library Classification: TP18 [Artificial intelligence theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
Previous RGB-D fusion systems based on convolutional neural networks typically employ a two-stream architecture, in which the RGB and depth inputs are learned independently. The multi-modal fusion stage is typically performed by concatenating the deep features from each stream during inference. This traditional two-stream architecture may suffer from insufficient multi-modal fusion due to the following two limitations: 1) the cross-modal complementarity is rarely studied in the bottom-up path, wherein we believe the cross-modal complements can be combined to learn new discriminative features to enlarge the RGB-D representation community, and 2) the cross-modal channels are typically combined by undifferentiated concatenation, which is ambiguous for selecting complementary cross-modal features. In this paper, we address these two limitations by proposing a novel three-stream attention-aware multi-modal fusion network. In the proposed architecture, a cross-modal distillation stream, accompanying the RGB-specific and depth-specific streams, is introduced to extract new RGB-D features at each level of the bottom-up path. Furthermore, the channel-wise attention mechanism is innovatively introduced to the cross-modal cross-level fusion problem to adaptively select complementary feature maps from each modality at each level. Extensive experiments demonstrate the effectiveness of the proposed architecture and its significant improvement over state-of-the-art RGB-D salient object detection methods.
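The abstract describes channel-wise attention applied to cross-modal cross-level fusion of the RGB, depth, and distilled RGB-D streams. Below is a minimal PyTorch-style sketch of that idea, assuming a squeeze-and-excitation style gating over the concatenated same-level feature maps; the module name, reduction factor, and tensor shapes are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn

class ChannelAttentionFusion(nn.Module):
    # Fuse same-level RGB, depth, and distilled RGB-D feature maps by learning
    # per-channel weights instead of plain (undifferentiated) concatenation.
    # Hypothetical sketch; not the paper's exact fusion module.
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.gap = nn.AdaptiveAvgPool2d(1)              # global average pooling ("squeeze")
        self.mlp = nn.Sequential(                       # bottleneck MLP ("excitation")
            nn.Linear(3 * channels, 3 * channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(3 * channels // reduction, 3 * channels),
            nn.Sigmoid(),
        )

    def forward(self, rgb, depth, rgbd):
        x = torch.cat([rgb, depth, rgbd], dim=1)        # (B, 3C, H, W)
        b, c, _, _ = x.shape
        w = self.mlp(self.gap(x).view(b, c))            # per-channel weights in [0, 1]
        return x * w.view(b, c, 1, 1)                   # re-weighted fused features

# Usage: fuse 64-channel feature maps produced by the three streams at one level.
fusion = ChannelAttentionFusion(channels=64)
rgb, depth, rgbd = (torch.randn(2, 64, 56, 56) for _ in range(3))
fused = fusion(rgb, depth, rgbd)                        # shape: (2, 192, 56, 56)

The learned gating replaces undifferentiated concatenation with per-channel weights, which is the adaptive selection behaviour the abstract attributes to the channel-wise attention mechanism.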
Pages: 2825-2835
Page count: 11
Related Papers
50 records in total
  • [31] RGB-D salient object detection: A survey
    Zhou, Tao
    Fan, Deng-Ping
    Cheng, Ming-Ming
    Shen, Jianbing
    Shao, Ling
    [J]. COMPUTATIONAL VISUAL MEDIA, 2021, 7 (01) : 37 - 69
  • [33] Calibrated RGB-D Salient Object Detection
    Ji, Wei
    Li, Jingjing
    Yu, Shuang
    Zhang, Miao
    Piao, Yongri
    Yao, Shunyu
    Bi, Qi
    Ma, Kai
    Zheng, Yefeng
    Lu, Huchuan
    Cheng, Li
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 9466 - 9476
  • [34] Bidirectional feature learning network for RGB-D salient object detection
    Niu, Ye
    Zhou, Sanping
    Dong, Yonghao
    Wang, Le
    Wang, Jinjun
    Zheng, Nanning
    [J]. PATTERN RECOGNITION, 2024, 150
  • [35] Feature Calibrating and Fusing Network for RGB-D Salient Object Detection
    Zhang, Qiang
    Qin, Qi
    Yang, Yang
    Jiao, Qiang
    Han, Jungong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (03) : 1493 - 1507
  • [36] Triple-Complementary Network for RGB-D Salient Object Detection
    Huang, Rui
    Xing, Yan
    Zou, Yaobin
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2020, 27 : 775 - 779
  • [37] GroupTransNet: Group transformer network for RGB-D salient object detection
    Fang, Xian
    Jiang, Mingfeng
    Zhu, Jinchao
    Shao, Xiuli
    Wang, Hongpeng
    [J]. NEUROCOMPUTING, 2024, 594
  • [38] Hierarchical Alternate Interaction Network for RGB-D Salient Object Detection
    Li, Gongyang
    Liu, Zhi
    Chen, Minyu
    Bai, Zhen
    Lin, Weisi
    Ling, Haibin
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 3528 - 3542
  • [39] An adaptive guidance fusion network for RGB-D salient object detection
    Sun, Haodong
    Wang, Yu
    Ma, Xinpeng
    [J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (02) : 1683 - 1693
  • [40] DMNet: Dynamic Memory Network for RGB-D Salient Object Detection
    Du, Haishun
    Zhang, Zhen
    Zhang, Minghao
    Qiao, Kangyi
    [J]. DIGITAL SIGNAL PROCESSING, 2023, 142