CANet: Co-attention network for RGB-D semantic segmentation

Cited by: 75
Authors
Zhou, Hao [1 ,3 ,4 ]
Qi, Lu [2 ]
Huang, Hai [1 ]
Yang, Xu [3 ,4 ]
Wan, Zhaoliang [1 ]
Wen, Xianglong [5 ]
Affiliations
[1] Harbin Engn Univ, Natl Key Lab Sci & Technol Underwater Vehicle, Harbin, Peoples R China
[2] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[3] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing, Peoples R China
[4] Univ Chinese Acad Sci, Beijing, Peoples R China
[5] Jihua Lab, Foshan, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
RGB-D; Multi-modal fusion; Co-attention; Semantic segmentation;
DOI
10.1016/j.patcog.2021.108468
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Incorporating depth (D) information into RGB images has proven effective and robust for semantic segmentation. However, fusing the two modalities is not trivial because of their inherent discrepancy in physical meaning: RGB captures appearance, while D captures geometry. In this paper, we propose a co-attention network (CANet) to build sound interaction between RGB and depth features. The key component of CANet is the co-attention fusion part, which comprises three modules. Specifically, the position and channel co-attention fusion modules adaptively fuse RGB and depth features in the spatial and channel dimensions, respectively. An additional fusion co-attention module further integrates the outputs of the position and channel co-attention fusion modules to obtain a more representative feature, which is used for the final semantic segmentation. Extensive experiments demonstrate the effectiveness of CANet in fusing RGB and depth features, achieving state-of-the-art performance on two challenging RGB-D semantic segmentation datasets, i.e., NYUDv2 and SUN-RGBD. (c) 2021 Elsevier Ltd. All rights reserved.
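The abstract describes two cross-modal attention paths (over spatial positions and over channels) whose outputs are then merged. The following NumPy sketch illustrates that general idea only; the exact projections, normalizations, and final fusion module of CANet are not specified in this record, so the function names, the residual-sum fusion, and all shapes here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def position_co_attention(rgb, depth):
    """Cross-modal attention over spatial positions.

    rgb, depth: (C, N) feature maps, flattened over N = H*W positions.
    Each RGB position attends to every depth position.
    """
    affinity = softmax(rgb.T @ depth, axis=-1)   # (N, N) spatial affinity
    return depth @ affinity.T                    # (C, N) depth features re-weighted per RGB position

def channel_co_attention(rgb, depth):
    """Cross-modal attention over channels: each RGB channel attends to every depth channel."""
    affinity = softmax(rgb @ depth.T, axis=-1)   # (C, C) channel affinity
    return affinity @ depth                      # (C, N) depth channels re-weighted per RGB channel

def co_attention_fuse(rgb, depth):
    """Combine both co-attention paths; a residual sum stands in for the final fusion module."""
    pos = position_co_attention(rgb, depth)
    chn = channel_co_attention(rgb, depth)
    return rgb + pos + chn

# Toy example: 8-channel features on a 4x4 spatial grid.
C, H, W = 8, 4, 4
rng = np.random.default_rng(0)
rgb = rng.standard_normal((C, H * W))
depth = rng.standard_normal((C, H * W))
fused = co_attention_fuse(rgb, depth)
print(fused.shape)  # (8, 16)
```

In practice such modules would operate on learned projections of CNN feature maps and feed a segmentation head; the sketch only shows how the two affinity matrices differ in which dimension they couple across modalities.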
Pages: 11
Related papers (50 records in total)
  • [1] Attention-based fusion network for RGB-D semantic segmentation
    Zhong, Li
    Guo, Chi
    Zhan, Jiao
    Deng, JingYi
    NEUROCOMPUTING, 2024, 608
  • [2] DCANet: Differential convolution attention network for RGB-D semantic segmentation
    Bai, Lizhi
    Yang, Jun
    Tian, Chunqi
    Sun, Yaoru
    Mao, Maoyu
    Xu, Yanjun
    Xu, Weirong
    PATTERN RECOGNITION, 2025, 162
  • [3] Cross-modal attention fusion network for RGB-D semantic segmentation
    Zhao, Qiankun
    Wan, Yingcai
    Xu, Jiqian
    Fang, Lijin
    NEUROCOMPUTING, 2023, 548
  • [4] CDMANet: central difference mutual attention network for RGB-D semantic segmentation
    Ge, Mengjiao
    Su, Wen
    Gao, Jinfeng
    Jia, Guoqiang
    JOURNAL OF SUPERCOMPUTING, 2025, 81 (01)
  • [5] RAFNet: RGB-D attention feature fusion network for indoor semantic segmentation
    Yan, Xingchao
    Hou, Sujuan
    Karim, Awudu
    Jia, Weikuan
    DISPLAYS, 2021, 70
  • [6] Attention-Aware and Semantic-Aware Network for RGB-D Indoor Semantic Segmentation
    Duan L.-J.
    Sun Q.-C.
    Qiao Y.-H.
    Chen J.-C.
    Cui G.-Q.
    Jisuanji Xuebao/Chinese Journal of Computers, 2021, 44 (02): : 275 - 291
  • [7] Automatic Network Architecture Search for RGB-D Semantic Segmentation
    Wang, Wenna
    Zhuo, Tao
    Zhang, Xiuwei
    Sun, Mingjun
    Yin, Hanlin
    Xing, Yinghui
    Zhang, Yanning
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 3777 - 3786
  • [8] RGB-D SEMANTIC SEGMENTATION: A REVIEW
    Hu, Yaosi
    Chen, Zhenzhong
    Lin, Weiyao
    2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW 2018), 2018,
  • [9] Semantic Progressive Guidance Network for RGB-D Mirror Segmentation
    Li, Chao
    Zhou, Wujie
    Zhou, Xi
    Yan, Weiqing
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 2780 - 2784
  • [10] Cascaded Feature Network for Semantic Segmentation of RGB-D Images
    Lin, Di
    Chen, Guangyong
    Cohen-Or, Daniel
    Heng, Pheng-Ann
    Huang, Hui
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 1320 - 1328