DCANet: Differential convolution attention network for RGB-D semantic segmentation

Cited by: 0
Authors
Bai, Lizhi [1]
Yang, Jun [1]
Tian, Chunqi [1]
Sun, Yaoru [1]
Mao, Maoyu [1]
Xu, Yanjun [1]
Xu, Weirong [1]
Affiliations
[1] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China
Funding
National Natural Science Foundation of China; National Key R&D Program of China;
Keywords
Semantic segmentation; RGB-D; Differential convolution; Attention; SALIENCY;
DOI
10.1016/j.patcog.2025.111379
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Combining RGB images and their corresponding depth maps in semantic segmentation has proven to be effective in recent years. However, existing RGB-D modal fusion methods either lack non-linear feature fusion abilities or treat both modal images equally, disregarding the intrinsic distribution gap and information loss. In this study, we have observed that depth maps are well-suited for providing fine-grained patterns of objects due to their local depth continuity, while RGB images effectively offer a global view. Based on this observation, we propose a novel module called the pixel Differential Convolution Attention (DCA) module, which takes into account geometric information and local-range correlations for depth data. Additionally, we extend the DCA module to create the Ensemble Differential Convolution Attention (EDCA), which propagates long-range contextual dependencies and seamlessly incorporates spatial distribution for RGB data. The DCA and EDCA modules dynamically adjust convolutional weights based on pixel differences, enabling self-adaptation in the local and long-range contexts, respectively. We construct a two-branch network, named the Differential Convolutional Network (DCANet), using the DCA and EDCA modules to fuse the local and global information from the two-modal data. As a result, the individual advantages of RGB and depth data are emphasized. Experimental results demonstrate that our DCANet achieves a new state-of-the-art performance for RGB-D semantic segmentation on two challenging benchmark datasets: NYUv2 and SUN-RGBD.
Pages: 11
Related papers
50 in total
  • [31] SCN: Switchable Context Network for Semantic Segmentation of RGB-D Images
    Lin, Di
    Zhang, Ruimao
    Ji, Yuanfeng
    Li, Ping
    Huang, Hui
    IEEE TRANSACTIONS ON CYBERNETICS, 2020, 50 (03) : 1120 - 1131
  • [32] Joining geometric and RGB features for RGB-D semantic segmentation
    Zhang, Shaopeng
    Zhong, Min
    Zeng, Gang
    Gan, Rui
    2019 INTERNATIONAL CONFERENCE ON IMAGE AND VIDEO PROCESSING, AND ARTIFICIAL INTELLIGENCE, 2019, 11321
  • [33] 3D Neighborhood Convolution: Learning Depth-Aware Features for RGB-D and RGB Semantic Segmentation
    Chen, Yunlu
    Mensink, Thomas
    Gavves, Efstratios
    2019 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2019), 2019, : 173 - 182
  • [34] Interactive Efficient Multi-Task Network for RGB-D Semantic Segmentation
    Xu, Xinhua
    Liu, Jinfu
    Liu, Hong
    ELECTRONICS, 2023, 12 (18)
  • [35] Transformer fusion for indoor RGB-D semantic segmentation
    Wu, Zongwei
    Zhou, Zhuyun
    Allibert, Guillaume
    Stolz, Christophe
    Demonceaux, Cedric
    Ma, Chao
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, 249
  • [36] DEPTH REMOVAL DISTILLATION FOR RGB-D SEMANTIC SEGMENTATION
    Fang, Tiyu
    Liang, Zhen
    Shao, Xiuli
    Dong, Zihao
    Li, Jinping
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2405 - 2409
  • [37] Edge-Aware Convolution for RGB-D Image Segmentation
    Chen, Rongsen
    Zhang, Fang-Lue
    Rhee, Taehyun
    2020 35TH INTERNATIONAL CONFERENCE ON IMAGE AND VISION COMPUTING NEW ZEALAND (IVCNZ), 2020,
  • [38] COUPLING TWO-STREAM RGB-D SEMANTIC SEGMENTATION NETWORK BY IDEMPOTENT MAPPINGS
    Xing, Yajie
    Wang, Jingbo
    Chen, Xiaokang
    Zeng, Gang
    2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 1850 - 1854
  • [39] EFDCNet: Encoding fusion and decoding correction network for RGB-D indoor semantic segmentation
    Chen, Jianlin
    Li, Gongyang
    Zhang, Zhijiang
    Zeng, Dan
    IMAGE AND VISION COMPUTING, 2024, 142
  • [40] MGCNet: Multilevel Gated Collaborative Network for RGB-D Semantic Segmentation of Indoor Scene
    Yang, Enquan
    Zhou, Wujie
    Qian, Xionghong
    Yu, Lu
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 2567 - 2571