Attention-aware Cross-modal Cross-level Fusion Network for RGB-D Salient Object Detection

Authors
Chen, Hao [1 ]
Li, You-Fu [1 ,2 ]
Su, Dan [1 ]
Affiliations
[1] City Univ Hong Kong, Dept Mech Engn, Kowloon, Hong Kong, Peoples R China
[2] City Univ Hong Kong, Shenzhen Res Inst, Hong Kong, Peoples R China
Keywords
MODEL;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Convolutional neural networks have achieved wide success in RGB saliency detection. Recently, the advent of RGB-D sensors such as Kinect has provided additional geometric saliency cues. However, the key challenge for RGB-D salient object detection, namely how to fuse RGB and depth information sufficiently, remains under-studied. Prior work mainly follows a two-stream architecture and combines RGB and depth features or decisions at an early or late stage. The multi-modal fusion stage is performed by directly concatenating the features from the two modalities without selection. In this work, we address this question by proposing a novel network built on one key insight: a selection module makes cross-modal cross-level combination more informative and sufficient. To this end, we introduce a top-down RGB-D fusion network that integrates an attention-aware cross-modal cross-level fusion block at each level to select discriminative features from each level and each modality. Extensive experiments on public datasets show that the proposed network addresses the key problems in RGB-D fusion and achieves state-of-the-art performance on RGB-D salient object detection.
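The difference between plain concatenation and attention-based selection can be sketched in a few lines of Python. This is only an illustration of the general idea, not the authors' network: the function names (`channel_attention`, `fuse_level`, `naive_concat`), the sigmoid gate on a per-channel global mean, and the fixed `scale` and `bias` values are all assumptions standing in for learned components that the abstract does not specify.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(channels, scale=1.0, bias=0.0):
    """Gate each channel by sigmoid(scale * global_mean + bias).

    `channels` is a list of channels, each a flat list of activations.
    `scale` and `bias` are hypothetical stand-ins for learned parameters.
    """
    gated = []
    for ch in channels:
        mean = sum(ch) / len(ch)           # global average pooling
        w = sigmoid(scale * mean + bias)   # attention weight in (0, 1)
        gated.append([w * v for v in ch])
    return gated


def naive_concat(rgb, depth):
    """Baseline: concatenate the two modalities with no selection."""
    return rgb + depth


def fuse_level(rgb, depth, top_down=None):
    """Fuse one level: gate each modality, then concatenate along channels.

    `top_down` optionally carries features from the deeper level
    (assumed here to share the same spatial size), which are gated too.
    """
    fused = channel_attention(rgb) + channel_attention(depth)
    if top_down is not None:
        fused = fused + channel_attention(top_down)
    return fused


# Toy features: two RGB channels and one depth channel, two pixels each.
rgb = [[1.0, 3.0], [0.0, 0.0]]
depth = [[-2.0, -2.0]]
plain = naive_concat(rgb, depth)
selected = fuse_level(rgb, depth)
```

In this toy setup the gate scales each channel by a weight between 0 and 1, so weakly activated channels contribute less to the fused output, whereas `naive_concat` passes every channel through unchanged.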
Pages: 6821-6826
Page count: 6