UMINet: a unified multi-modality interaction network for RGB-D and RGB-T salient object detection

Authors
Lina Gao
Ping Fu
Mingzhu Xu
Tiantian Wang
Bing Liu
Affiliations
[1] Harbin Institute of Technology, School of Electronics and Information Engineering
[2] Shandong University, School of Software
Source
The Visual Computer | 2024 / Vol. 40
Keywords
Salient object detection; RGB-D and RGB-T images; Asymmetric network; Multi-modality feature fusion
DOI
Not available
Abstract
Multi-modality images with complementary cues can significantly improve the performance of salient object detection (SOD) methods in challenging scenes. However, most existing methods are designed specifically for either RGB-D or RGB-T SOD, so a unified SOD framework that can process various multi-modality images is still needed. To address this issue, we propose a unified multi-modality interaction fusion framework for RGB-D and RGB-T SOD, named UMINet. We investigate the differences between appearance maps and complementary images in depth and design an asymmetric backbone to extract appearance features and complementary cues. For the complementary cues branch, a complementary information aware module (CIAM) is proposed to perceive and enhance the weights of complementary modality features. We also propose a multi-modality difference fusion (MDF) block to fuse cross-modality features; this block simultaneously considers the differences and the consistency between the appearance features and the complementary features. Furthermore, to capture rich contextual dependencies and integrate cross-level multi-modality features, we design a mutual refinement decoder (MRD) that progressively predicts saliency results. The MRD consists of three reverse perception blocks (RPB) and five sub-decoders. Extensive experiments on six RGB-D SOD datasets and three RGB-T SOD datasets indicate that the proposed UMINet achieves substantial improvement over existing state-of-the-art (SOTA) models.
Pages: 1565 - 1582
Page count: 17
Related papers
50 in total
  • [1] UMINet: a unified multi-modality interaction network for RGB-D and RGB-T salient object detection
    Gao, Lina
    Fu, Ping
    Xu, Mingzhu
    Wang, Tiantian
    Liu, Bing
    [J]. VISUAL COMPUTER, 2024, 40 (03): 1565 - 1582
  • [2] Unified Information Fusion Network for Multi-Modal RGB-D and RGB-T Salient Object Detection
    Gao, Wei
    Liao, Guibiao
    Ma, Siwei
    Li, Ge
    Liang, Yongsheng
    Lin, Weisi
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (04): 2091 - 2106
  • [3] Attention-guided Multi-modality Interaction Network for RGB-D Salient Object Detection
    Wang, Ruimin
    Wang, Fasheng
    Su, Yiming
    Sun, Jing
    Sun, Fuming
    Li, Haojie
    [J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (03)
  • [4] Modality-Induced Transfer-Fusion Network for RGB-D and RGB-T Salient Object Detection
    Chen, Gang
    Shao, Feng
    Chai, Xiongli
    Chen, Hangwei
    Jiang, Qiuping
    Meng, Xiangchao
    Ho, Yo-Sung
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (04): 1787 - 1801
  • [5] Saliency Prototype for RGB-D and RGB-T Salient Object Detection
    Zhang, Zihao
    Wang, Jie
    Han, Yahong
    [J]. PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023: 3696 - 3705
  • [6] MULTI-MODALITY DIVERSITY FUSION NETWORK WITH SWINTRANSFORMER FOR RGB-D SALIENT OBJECT DETECTION
    Duan, Songsong
    Xia, Chenxing
    Gao, Xiuju
    Ge, Bin
    Zhang, Hanling
    Li, Kuan-Ching
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022: 1076 - 1080
  • [7] Multi-modality information refinement fusion network for RGB-D salient object detection
    Bao, Hua
    Fan, Bo
    [J]. VISUAL COMPUTER, 2024, 40 (06): 4183 - 4199
  • [8] EFGNet: Encoder steered multi-modality feature guidance network for RGB-D salient object detection
    Xia, Chenxing
    Duan, Songsong
    Fang, Xianjin
    Gao, Xiuju
    Sun, Yanguang
    Ge, Bin
    Zhang, Hanling
    Li, Kuan-Ching
    [J]. DIGITAL SIGNAL PROCESSING, 2022, 131
  • [9] HDNet: Multi-Modality Hierarchy-Aware Decision Network for RGB-D Salient Object Detection
    Xia, Chengxing
    Duan, Songsong
    Ge, Bin
    Zhang, Hanling
    Li, Kuan-Ching
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2022, 29: 2577 - 2581
  • [10] Enabling modality interactions for RGB-T salient object detection
    Zhang, Qiang
    Xi, Ruida
    Xiao, Tonglin
    Huang, Nianchang
    Luo, Yongjiang
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2022, 222