DCANet: Differential convolution attention network for RGB-D semantic segmentation

Cited by: 0
Authors
Bai, Lizhi [1]
Yang, Jun [1]
Tian, Chunqi [1]
Sun, Yaoru [1]
Mao, Maoyu [1]
Xu, Yanjun [1]
Xu, Weirong [1]
Affiliations
[1] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Semantic segmentation; RGB-D; Differential convolution; Attention; SALIENCY;
DOI
10.1016/j.patcog.2025.111379
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Combining RGB images and their corresponding depth maps in semantic segmentation has proven to be effective in recent years. However, existing RGB-D modal fusion methods either lack non-linear feature fusion abilities or treat both modal images equally, disregarding the intrinsic distribution gap and information loss. In this study, we have observed that depth maps are well-suited for providing fine-grained patterns of objects due to their local depth continuity, while RGB images effectively offer a global view. Based on this observation, we propose a novel module called the pixel Differential Convolution Attention (DCA) module, which takes into account geometric information and local-range correlations for depth data. Additionally, we extend the DCA module to create the Ensemble Differential Convolution Attention (EDCA), which propagates long-range contextual dependencies and seamlessly incorporates spatial distribution for RGB data. The DCA and EDCA modules dynamically adjust convolutional weights based on pixel differences, enabling self-adaptation in the local and long-range contexts, respectively. We construct a two-branch network, named the Differential Convolutional Network (DCANet), using the DCA and EDCA modules to fuse the local and global information from the two-modal data. As a result, the individual advantages of RGB and depth data are emphasized. Experimental results demonstrate that our DCANet achieves a new state-of-the-art performance for RGB-D semantic segmentation on two challenging benchmark datasets: NYUv2 and SUN-RGBD.
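The abstract says that convolutional weights are adjusted dynamically based on pixel differences, but it does not give the formula. As a rough illustration of that general idea only, and not the authors' DCA or EDCA module, the sketch below rescales each kernel tap by a Gaussian similarity between the neighbouring pixel and the centre pixel. The function name `difference_weighted_conv`, the Gaussian similarity, the normalisation, and the `sigma` parameter are all assumptions made for this example.

```python
import math


def difference_weighted_conv(image, kernel, sigma=1.0):
    """Hypothetical sketch: a square kernel whose taps are rescaled per pixel
    by how close each neighbour's value is to the centre pixel's value.

    image  : 2D list of floats (e.g. a small depth map)
    kernel : square 2D list of non-negative floats with odd side length
    sigma  : assumed bandwidth of the Gaussian similarity
    """
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            centre = image[y][x]
            num = 0.0
            den = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy, xx = y + dy, x + dx
                    if not (0 <= yy < h and 0 <= xx < w):
                        continue  # skip taps that fall outside the image
                    diff = image[yy][xx] - centre
                    # Similarity in (0, 1]; equals 1 when values are identical.
                    sim = math.exp(-(diff * diff) / (2.0 * sigma * sigma))
                    wgt = kernel[dy + r][dx + r] * sim
                    num += wgt * image[yy][xx]
                    den += wgt
            # Normalise so the adjusted weights sum to one at every pixel.
            out[y][x] = num / den if den else centre
    return out


# Toy depth map with a sharp step between two surfaces.
depth = [[1.0, 1.0, 5.0, 5.0] for _ in range(4)]
box = [[1.0] * 3 for _ in range(3)]
smoothed = difference_weighted_conv(depth, box, sigma=0.5)
```

With a small `sigma`, neighbours across the depth step receive almost no weight, so the step stays sharp. This is one way to read the abstract's point that local depth continuity helps preserve fine-grained object patterns.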
Pages: 11
Related papers
50 in total
  • [21] Correction to: Cascading context enhancement network for RGB-D semantic segmentation
    Xu Tang
    Zejun Zhang
    Yan Meng
    Jianxiao Xie
    Changbing Tang
    Weichuan Zhang
    Multimedia Tools and Applications, 2025, 84 (9) : 6005 - 6005
  • [22] TSNet: Three-Stream Self-Attention Network for RGB-D Indoor Semantic Segmentation
    Zhou, Wujie
    Yuan, Jianzhong
    Lei, Jingsheng
    Luo, Ting
    IEEE INTELLIGENT SYSTEMS, 2021, 36 (04) : 73 - 78
  • [23] GANet: geometry-aware network for RGB-D semantic segmentation
    Tian, Chunqi
    Xu, Weirong
    Bai, Lizhi
    Yang, Jun
    Xu, Yanjun
    APPLIED INTELLIGENCE, 2025, 55 (06)
  • [24] DDNet: Depth Dominant Network for Semantic Segmentation of RGB-D Images
    Rong, Peizhi
    SENSORS, 2024, 24 (21)
  • [25] Intra-inter Modal Attention Blocks for RGB-D Semantic Segmentation
    Choi, Soyun
    Zhang, Youjia
    Hong, Sungeun
    PROCEEDINGS OF THE 2023 ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2023, 2023, : 217 - 225
  • [26] SPNet: An RGB-D Sequence Progressive Network for Road Semantic Segmentation
    Zhou, Zhi
    Zhang, Yuhang
    Hua, Guoguang
    Long, Ruijing
    Tian, Shishun
    Zou, Wenbin
    2023 IEEE 25TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, MMSP, 2023,
  • [27] RGB-D Dual Modal Information Complementary Semantic Segmentation Network
    Wang L.
    Gu N.
    Xin J.
    Wang S.
    Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics, 2023, 35 (10) : 1489 - 1499
  • [28] RGB-D indoor semantic segmentation network based on wavelet transform
    Runze Fan
    Yuhong Liu
    Shiyi Jiang
    Rongfen Zhang
    Evolving Systems, 2023, 14 : 981 - 991
  • [29] RGB-D indoor semantic segmentation network based on wavelet transform
    Fan, Runze
    Liu, Yuhong
    Jiang, Shiyi
    Zhang, Rongfen
    EVOLVING SYSTEMS, 2023, 14 (06) : 981 - 991
  • [30] Zig-Zag Network for Semantic Segmentation of RGB-D Images
    Lin, Di
    Huang, Hui
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2020, 42 (10) : 2642 - 2655