On Exploring Shape and Semantic Enhancements for RGB-X Semantic Segmentation

Cited by: 2

Authors:

Yang, Yuanjian ^{[1]}

Shan, Caifeng ^{[1,2]}

Zhao, Fang ^{[2]}

Liang, Wenli

Han, Jungong ^{[3]}

Affiliations:

[1] Shandong Univ Sci & Technol, Coll Elect Engn & Automat, Qingdao 266590, Peoples R China

[2] Nanjing Univ, Sch Intelligence Sci & Technol, Nanjing 210023, Peoples R China

[3] Univ Sheffield, Dept Comp Sci, Sheffield S10 2TN, S Yorkshire, England

Source:

IEEE TRANSACTIONS ON INTELLIGENT VEHICLES | 2024 / Vol. 9 / No. 01

Keywords:

Decoding; Shape; Semantics; Semantic segmentation; Feature extraction; Fuses; Convolution; Deep supervision; inter-pixel relationship; RGB-X semantic segmentation; signed distance map; NETWORK; FUSION;

DOI:

10.1109/TIV.2023.3296219

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The robustness of scene segmentation under poor environmental conditions can be enhanced with the aid of information from other modalities, e.g., thermal and/or depth. In this context, RGB-X semantic segmentation is becoming prevalent. Most existing RGB-X semantic segmentation models focus on the fusion strategy between different modalities or between multiple stages, but ignore feature recovery at the decoder side. This makes it difficult to recover the information lost due to downsampling, and also overlooks the pixel connections between segmented objects. To solve these problems, we propose a Shape and Semantic Enhancements Module (SASEM) in this article, which is characterized by innovations on the decoder side. More specifically, we divide the decoder into a shape supervision branch and a semantic supervision branch. The former reinforces the shape information of each category by using a signed distance map. A multi-stage enhancement structure is designed to further strengthen the shape information of features. The latter directly enhances the semantic extraction capability of the decoder by employing a channel-level semantic enhancement module, which reduces the shape supervision branch's interference with the semantic information. The two branches work together to enhance the inter-pixel relationship, thus making the decoder more capable of recovering the fused encoded features. Our proposed SASEM serves as a plug-and-play module for different networks, as evidenced by experiments on various RGB-Thermal and RGB-Depth datasets, where our module can be easily integrated and consistently improves performance.
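
The abstract names a signed distance map as the shape supervision target but does not say how it is built. As a rough illustration only, the sketch below computes one for a tiny binary mask by brute force in plain Python. The function name `signed_distance_map`, the sign convention (negative inside the object, positive outside), and the choice to measure distance to the nearest pixel of the opposite class are all assumptions made for this example, not details taken from the paper.

```python
import math


def signed_distance_map(mask):
    """Brute-force signed distance map for a small binary mask.

    mask: list of rows containing 0 (background) or 1 (object).
    Assumed convention (not from the paper): the value is the Euclidean
    distance to the nearest pixel of the opposite class, negative for
    object pixels and positive for background pixels.
    """
    h, w = len(mask), len(mask[0])
    inside = [(r, c) for r in range(h) for c in range(w) if mask[r][c]]
    outside = [(r, c) for r in range(h) for c in range(w) if not mask[r][c]]

    # A mask with a single class has no boundary; return zeros in that case.
    if not inside or not outside:
        return [[0.0] * w for _ in range(h)]

    sdm = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            # Measure against the opposite class of the current pixel.
            targets = outside if mask[r][c] else inside
            d = min(math.hypot(r - tr, c - tc) for tr, tc in targets)
            sdm[r][c] = -d if mask[r][c] else d
    return sdm


# A 3x3 square object centered in a 5x5 image.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
sdm = signed_distance_map(mask)
```

The brute-force search is quadratic in the number of pixels, so it only suits toy inputs; for real images a Euclidean distance transform routine would normally replace the inner loop.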


Pages: 2223-2235

Page count: 13

50 records in total

[1] CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation With Transformers
Zhang, Jiaming
Liu, Huayao
Yang, Kailun
Hu, Xinxin
Liu, Ruiping
Stiefelhagen, Rainer
[J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (12) : 14679 - 14694
[2] Exploring the Applicability of Spectral Recovery in Semantic Segmentation of RGB Images
Du, Zhuoran
Wei, Shikui
Liu, Ting
Zhang, Shunli
Chen, Xiaotong
Zhang, Shiyin
Zhao, Yao
[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 1932 - 1943
[3] RGB-D SEMANTIC SEGMENTATION: A REVIEW
Hu, Yaosi
Chen, Zhenzhong
Lin, Weiyao
[J]. 2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW 2018), 2018
[4] CAFseg: A Semantic segmentation network with cross aggregation fusion strategy for RGB-thermal semantic segmentation
Yi, Shi
Wu, Lang
Liu, Xi
Li, Junjie
Jiang, Gang
[J]. INFRARED PHYSICS & TECHNOLOGY, 2024, 136
[5] Joining geometric and RGB features for RGB-D semantic segmentation
Zhang, Shaopeng
Zhong, Min
Zeng, Gang
Gan, Rui
[J]. 2019 INTERNATIONAL CONFERENCE ON IMAGE AND VIDEO PROCESSING, AND ARTIFICIAL INTELLIGENCE, 2019, 11321
[6] Salient Semantic Segmentation Based on RGB-D Camera for Robot Semantic Mapping
Hu, Lihe
Zhang, Yi
Wang, Yang
Yang, Huan
Tan, Shuyi
[J]. APPLIED SCIENCES-BASEL, 2023, 13 (06)
[7] SGFNet: Semantic-Guided Fusion Network for RGB-Thermal Semantic Segmentation
WangLi, Yike
Li, Gongyang
Liu, Zhi
[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (12) : 7737 - 7748
[8] RGB-X Classification for Electronics Sorting
Abhimanyu, F. N. U.
Zodage, Tejas
Thillaivasan, Umesh
Lai, Xinyue
Chakwate, Rahul
Santillan, Javier
Oti, Emma
Zhao, Ming
Boirum, Ralph
Choset, Howie
Travers, Matthew
[J]. 2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022, : 5973 - 5980
[9] ShapeConv: Shape-aware Convolutional Layer for Indoor RGB-D Semantic Segmentation
Cao, Jinming
Leng, Hanchao
Lischinski, Dani
Cohen-Or, Danny
Tu, Changhe
Li, Yangyan
[J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 7068 - 7077
[10] RGB-T Semantic Segmentation With Location, Activation, and Sharpening
Li, Gongyang
Wang, Yike
Liu, Zhi
Zhang, Xinpeng
Zeng, Dan
[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (03) : 1223 - 1235
