DCANet: Differential convolution attention network for RGB-D semantic segmentation

Cited by: 0

Authors:

Bai, Lizhi [1]

Yang, Jun [1]

Tian, Chunqi [1]

Sun, Yaoru [1]

Mao, Maoyu [1]

Xu, Yanjun [1]

Xu, Weirong [1]

Affiliations:

[1] Tongji University, Department of Computer Science and Technology, Shanghai 201804, People's Republic of China

Source:

Pattern Recognition | 2025 / Vol. 162

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China;

Keywords:

Semantic segmentation; RGB-D; Differential convolution; Attention; SALIENCY;

DOI:

10.1016/j.patcog.2025.111379

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Combining RGB images with their corresponding depth maps has proven effective for semantic segmentation in recent years. However, existing RGB-D modal fusion methods either lack non-linear feature fusion abilities or treat both modalities equally, disregarding the intrinsic distribution gap and information loss. In this study, we observe that depth maps are well suited to providing fine-grained patterns of objects because of their local depth continuity, while RGB images effectively offer a global view. Based on this observation, we propose a novel module called the pixel Differential Convolution Attention (DCA) module, which takes into account geometric information and local-range correlations for depth data. Additionally, we extend the DCA module to create the Ensemble Differential Convolution Attention (EDCA), which propagates long-range contextual dependencies and seamlessly incorporates spatial distribution for RGB data. The DCA and EDCA modules dynamically adjust convolutional weights based on pixel differences, enabling self-adaptation in the local and long-range contexts, respectively. We construct a two-branch network, named the Differential Convolutional Network (DCANet), using the DCA and EDCA modules to fuse the local and global information from the two-modal data. As a result, the individual advantages of RGB and depth data are emphasized. Experimental results demonstrate that our DCANet achieves a new state-of-the-art performance for RGB-D semantic segmentation on two challenging benchmark datasets: NYUv2 and SUN-RGBD.
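
The abstract states that the DCA and EDCA modules adjust convolutional weights based on pixel differences, but it does not give the formulation. The following is a minimal sketch of that general idea only, not the authors' module: it assumes a fixed k x k kernel whose weights are scaled by exp(-|difference| / sigma) between each neighbour and the window centre, which is similar in spirit to a bilateral filter. The function name `pixel_difference_attention`, the exponential weighting, and the `sigma` parameter are all assumptions made for illustration.

```python
import math


def pixel_difference_attention(depth, kernel, sigma=1.0):
    """Illustrative sketch (assumed formulation, not the paper's DCA module).

    depth  : 2D list of floats (H x W), e.g. a depth map
    kernel : 2D list of floats (k x k, k odd), static convolution weights
    sigma  : hypothetical scale for how fast weights decay with pixel difference
    """
    h, w = len(depth), len(depth[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            center = depth[y][x]
            acc, norm = 0.0, 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue  # skip neighbours outside the map
                    # Scale the static weight by similarity to the centre pixel.
                    diff = abs(depth[ny][nx] - center)
                    weight = kernel[dy + r][dx + r] * math.exp(-diff / sigma)
                    acc += weight * depth[ny][nx]
                    norm += weight
            out[y][x] = acc / norm if norm else center
    return out


# Usage: a step edge between depth 0 and depth 10 with a 3 x 3 box kernel.
step = [[0.0, 0.0, 10.0, 10.0] for _ in range(3)]
box = [[1.0] * 3 for _ in range(3)]
smoothed = pixel_difference_attention(step, box, sigma=1.0)
```

With a plain box filter, the pixels next to the edge would be averaged towards intermediate values. Under the assumed difference-based weighting, neighbours across the depth discontinuity contribute almost nothing, so each side of the edge keeps its own value. This is one way to read the abstract's claim that depth continuity supports fine-grained local patterns.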


Pages: 11
