CANet: Co-attention network for RGB-D semantic segmentation

Cited by: 75
Authors
Zhou, Hao [1 ,3 ,4 ]
Qi, Lu [2 ]
Huang, Hai [1 ]
Yang, Xu [3 ,4 ]
Wan, Zhaoliang [1 ]
Wen, Xianglong [5 ]
Affiliations
[1] Harbin Engn Univ, Natl Key Lab Sci & Technol Underwater Vehicle, Harbin, Peoples R China
[2] Chinese Univ Hong Kong, Hong Kong, Peoples R China
[3] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing, Peoples R China
[4] Univ Chinese Acad Sci, Beijing, Peoples R China
[5] Jihua Lab, Foshan, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
RGB-D; Multi-modal fusion; Co-attention; Semantic segmentation; FEATURES;
DOI
10.1016/j.patcog.2021.108468
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Incorporating depth (D) information into RGB images has proven effective and robust for semantic segmentation. However, fusing the two is not trivial because of the inherent discrepancy in their physical meaning: RGB encodes color information, whereas D encodes depth information. In this paper, we propose a co-attention network (CANet) to build sound interaction between RGB and depth features. The key part of CANet is the co-attention fusion part, which comprises three modules. Specifically, the position and channel co-attention fusion modules adaptively fuse RGB and depth features in the spatial and channel dimensions. An additional fusion co-attention module further integrates the outputs of the position and channel co-attention fusion modules to obtain a more representative feature, which is used for the final semantic segmentation. Extensive experiments demonstrate the effectiveness of CANet in fusing RGB and depth features, achieving state-of-the-art performance on two challenging RGB-D semantic segmentation datasets, i.e., NYUDv2 and SUN-RGBD. (c) 2021 Elsevier Ltd. All rights reserved.
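The abstract describes fusing RGB and depth features along the spatial and channel dimensions and then combining the two results. The following is a minimal, dependency-free sketch of that general idea, not the authors' implementation. Several things are assumptions made only for illustration: the function names, the choice of depth features as the attention query and RGB features as key and value, the absence of learned projections, and the element-wise sum in `fuse`, which stands in for the paper's fusion co-attention module. Features are plain C x N lists (channels by flattened spatial positions).

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def position_co_attention(rgb, depth):
    """Spatial attention: each depth position attends over RGB positions.

    Assumption: depth acts as the query, RGB as key and value.
    """
    affinity = matmul(transpose(depth), rgb)      # N x N
    weights = [softmax(r) for r in affinity]      # each row sums to 1
    # out[c][i] = sum_j weights[i][j] * rgb[c][j]
    return matmul(rgb, transpose(weights))        # C x N


def channel_co_attention(rgb, depth):
    """Channel attention: each depth channel attends over RGB channels."""
    affinity = matmul(depth, transpose(rgb))      # C x C
    weights = [softmax(r) for r in affinity]
    return matmul(weights, rgb)                   # C x N


def fuse(rgb, depth):
    """Placeholder fusion: element-wise sum of the two attention outputs.

    Assumption: the paper uses a further co-attention module here.
    """
    pos = position_co_attention(rgb, depth)
    chn = channel_co_attention(rgb, depth)
    return [[p + c for p, c in zip(p_row, c_row)]
            for p_row, c_row in zip(pos, chn)]


if __name__ == "__main__":
    rgb = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]      # C=2, N=3
    depth = [[0.5, 0.1, 0.2], [0.3, 0.3, 0.9]]
    print(fuse(rgb, depth))
```

In this sketch the two branches differ only in which axis the affinity matrix is computed over: positions (N x N) for the spatial branch and channels (C x C) for the channel branch. A real model would add learned projections and operate on tensors rather than nested lists.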
Pages: 11
Related papers
50 in total
  • [21] TSNet: Three-Stream Self-Attention Network for RGB-D Indoor Semantic Segmentation
    Zhou, Wujie
    Yuan, Jianzhong
    Lei, Jingsheng
    Luo, Ting
    IEEE INTELLIGENT SYSTEMS, 2021, 36 (04) : 73 - 78
  • [22] GANet: geometry-aware network for RGB-D semantic segmentation
    Tian, Chunqi
    Xu, Weirong
    Bai, Lizhi
    Yang, Jun
    Xu, Yanjun
    APPLIED INTELLIGENCE, 2025, 55 (06)
  • [23] DDNet: Depth Dominant Network for Semantic Segmentation of RGB-D Images
    Rong, Peizhi
    SENSORS, 2024, 24 (21)
  • [24] Intra-inter Modal Attention Blocks for RGB-D Semantic Segmentation
    Choi, Soyun
    Zhang, Youjia
    Hong, Sungeun
    PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023, : 217 - 225
  • [25] SPNet: An RGB-D Sequence Progressive Network for Road Semantic Segmentation
    Zhou, Zhi
    Zhang, Yuhang
    Hua, Guoguang
    Long, Ruijing
    Tian, Shishun
    Zou, Wenbin
    2023 IEEE 25TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, MMSP, 2023,
  • [26] RGB-D Dual Modal Information Complementary Semantic Segmentation Network
    Wang L.
    Gu N.
    Xin J.
    Wang S.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2023, 35 (10) : 1489 - 1499
  • [27] RGB-D indoor semantic segmentation network based on wavelet transform
    Runze Fan
    Yuhong Liu
    Shiyi Jiang
    Rongfen Zhang
    Evolving Systems, 2023, 14 : 981 - 991
  • [28] RGB-D indoor semantic segmentation network based on wavelet transform
    Fan, Runze
    Liu, Yuhong
    Jiang, Shiyi
    Zhang, Rongfen
    EVOLVING SYSTEMS, 2023, 14 (06) : 981 - 991
  • [29] Zig-Zag Network for Semantic Segmentation of RGB-D Images
    Lin, Di
    Huang, Hui
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (10) : 2642 - 2655
  • [30] SCN: Switchable Context Network for Semantic Segmentation of RGB-D Images
    Lin, Di
    Zhang, Ruimao
    Ji, Yuanfeng
    Li, Ping
    Huang, Hui
    IEEE TRANSACTIONS ON CYBERNETICS, 2020, 50 (03) : 1120 - 1131