On Exploring Shape and Semantic Enhancements for RGB-X Semantic Segmentation

Cited by: 2
Authors
Yang, Yuanjian [1]
Shan, Caifeng [1, 2]
Zhao, Fang [2]
Liang, Wenli
Han, Jungong [3]
Affiliations
[1] Shandong Univ Sci & Technol, Coll Elect Engn & Automat, Qingdao 266590, Peoples R China
[2] Nanjing Univ, Sch Intelligence Sci & Technol, Nanjing 210023, Peoples R China
[3] Univ Sheffield, Dept Comp Sci, Sheffield S10 2TN, S Yorkshire, England
Source
Keywords
Decoding; Shape; Semantics; Semantic segmentation; Feature extraction; Fuses; Convolution; Deep supervision; inter-pixel relationship; RGB-X semantic segmentation; signed distance map; NETWORK; FUSION
DOI
10.1109/TIV.2023.3296219
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The robustness of scene segmentation under poor environmental conditions can be enhanced with the aid of information from other modalities, e.g., thermal and/or depth. In this context, RGB-X semantic segmentation is becoming prevalent. Most existing RGB-X semantic segmentation models focus on the fusion strategy between different modalities or between multiple stages, but ignore feature recovery at the decoder side. This makes it difficult to recover the information lost to downsampling and overlooks the pixel connections between segmented objects. To solve these problems, we propose a Shape and Semantic Enhancements Module (SASEM) in this article, which is characterized by innovations on the decoder side. More specifically, we divide the decoder into a shape supervision branch and a semantic supervision branch. The former reinforces the shape information of each category by using a signed distance map, and a multi-stage enhancement structure is designed to further strengthen the shape information of the features. The latter directly enhances the semantic extraction capability of the decoder by employing a channel-level semantic enhancement module, which reduces the interference of the shape supervision branch with the semantic information. The two branches work together to enhance the inter-pixel relationship, making the decoder more capable of recovering the fused encoded features. Our proposed SASEM serves as a plug-and-play module for different networks, as evidenced by experiments on various RGB-Thermal and RGB-Depth datasets, where our module is easily integrated and consistently improves performance.
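The abstract names a signed distance map as the shape supervision target but gives no construction details. As a rough illustration only, the sketch below computes one for a single binary class mask in pure Python. The function name `signed_distance_map`, the brute-force nearest-pixel search, and the sign convention (negative inside the object, positive outside) are all assumptions made for this example, not the authors' method.

```python
import math


def signed_distance_map(mask):
    """Brute-force signed distance map for a binary mask (list of 0/1 rows).

    Assumed convention (not taken from the paper): each pixel stores the
    Euclidean distance to the nearest pixel of the opposite class,
    negative inside the object and positive outside.
    """
    h, w = len(mask), len(mask[0])
    fg = [(r, c) for r in range(h) for c in range(w) if mask[r][c]]
    bg = [(r, c) for r in range(h) for c in range(w) if not mask[r][c]]
    # With only one class present there is no boundary; return zeros.
    if not fg or not bg:
        return [[0.0] * w for _ in range(h)]
    sdm = []
    for r in range(h):
        row = []
        for c in range(w):
            inside = bool(mask[r][c])
            targets = bg if inside else fg
            d = min(math.hypot(r - tr, c - tc) for tr, tc in targets)
            row.append(-d if inside else d)
        sdm.append(row)
    return sdm


# Toy 5x5 mask with a 3x3 object in the middle.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
sdm = signed_distance_map(mask)
for row in sdm:
    print(" ".join(f"{v:6.2f}" for v in row))
```

Unlike a hard class label, such a map changes continuously across object boundaries, so a regression loss on it carries information about how far each pixel is from the contour. In practice a per-class map would more likely be computed with an optimized distance transform than with this quadratic search.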
Pages: 2223-2235
Page count: 13