DCANet: Differential convolution attention network for RGB-D semantic segmentation

Cited by: 0
Authors
Bai, Lizhi [1]
Yang, Jun [1]
Tian, Chunqi [1]
Sun, Yaoru [1]
Mao, Maoyu [1]
Xu, Yanjun [1]
Xu, Weirong [1]
Affiliation
[1] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Semantic segmentation; RGB-D; Differential convolution; Attention; Saliency;
DOI
10.1016/j.patcog.2025.111379
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Combining RGB images and their corresponding depth maps in semantic segmentation has proven to be effective in recent years. However, existing RGB-D modal fusion methods either lack non-linear feature fusion abilities or treat both modal images equally, disregarding the intrinsic distribution gap and information loss. In this study, we observe that depth maps are well-suited for providing fine-grained patterns of objects due to their local depth continuity, while RGB images effectively offer a global view. Based on this observation, we propose a novel module called the pixel Differential Convolution Attention (DCA) module, which takes into account geometric information and local-range correlations for depth data. Additionally, we extend the DCA module to create the Ensemble Differential Convolution Attention (EDCA), which propagates long-range contextual dependencies and seamlessly incorporates spatial distribution for RGB data. The DCA and EDCA modules dynamically adjust convolutional weights based on pixel differences, enabling self-adaptation in the local and long-range contexts, respectively. We construct a two-branch network, named the Differential Convolutional Network (DCANet), using the DCA and EDCA modules to fuse the local and global information from the two-modal data. As a result, the individual advantages of RGB and depth data are emphasized. Experimental results demonstrate that our DCANet achieves a new state-of-the-art performance for RGB-D semantic segmentation on two challenging benchmark datasets: NYUv2 and SUN-RGBD.
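The abstract's central idea, weights that adapt to pixel differences within a local window, can be illustrated with a small NumPy sketch. This is not the authors' DCA module: the record gives no formula, so the Gaussian weighting of the centre-to-neighbour difference, the function name `pixel_difference_attention`, and the parameters `kernel_size` and `sigma` are all assumptions chosen for illustration. What is shown is essentially a range-kernel (bilateral-style) filter on a single-channel depth map.

```python
import numpy as np

def pixel_difference_attention(depth, kernel_size=3, sigma=1.0):
    """Aggregate each pixel's neighbourhood with difference-based weights.

    Hypothetical illustration only: the Gaussian of the pixel difference
    is an assumed weighting, not the formulation used in the paper.
    """
    depth = np.asarray(depth, dtype=np.float64)
    pad = kernel_size // 2
    # Replicate border values so every pixel has a full window.
    padded = np.pad(depth, pad, mode="edge")
    h, w = depth.shape
    out = np.zeros((h, w), dtype=np.float64)
    norm = np.zeros((h, w), dtype=np.float64)
    for dy in range(kernel_size):
        for dx in range(kernel_size):
            # Neighbour at offset (dy - pad, dx - pad) for every pixel.
            neighbour = padded[dy:dy + h, dx:dx + w]
            diff = neighbour - depth
            # Small differences get weights near 1, large ones near 0.
            weight = np.exp(-(diff ** 2) / (2.0 * sigma ** 2))
            out += weight * neighbour
            norm += weight
    # The centre pixel always contributes weight 1, so norm is never zero.
    return out / norm

# A depth step: neighbours across the edge receive almost no weight,
# so the discontinuity is kept while flat regions are averaged.
edge = np.tile(np.array([0.0, 0.0, 10.0, 10.0]), (4, 1))
smoothed = pixel_difference_attention(edge, sigma=0.5)
```

Because the weights depend on the input itself rather than on fixed kernel values, neighbours on the same surface dominate the aggregation, which is one plausible reading of "self-adaptation in the local context" for depth data.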
Pages: 11