Enhanced RGB-T saliency detection via thermal-guided multi-stage attention network

Cited by: 0
Authors
Pang, Yu [1]
Huang, Yang [1]
Weng, Chenyu [1]
Lyu, Jialin [1]
Bai, Chuanyue [1]
Yu, Xiaosheng [2]
Affiliations
[1] Shenyang Univ Technol, Sch Artificial Intelligence, Shenyang 110870, Liaoning, Peoples R China
[2] Northeastern Univ, Fac Robot Sci & Engn, Shenyang 110169, Liaoning, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
RGB-T saliency detection; Single-stream network; Multi-stage framework; Modality-interaction; Attention mechanism; FUSION;
DOI
10.1007/s00371-025-03855-3
Chinese Library Classification
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Single-stream structures are prevalent in RGB-T saliency detection due to their efficiency and lightweight nature. However, existing multi-modal single-stream methods suffer from limited detection performance, primarily due to inadequate exploitation of the thermal modality's strengths. To address this, we propose a novel single-stream network called Thermal-induced Modality-interaction Multi-stage Attention Network (TMMANet). Our approach leverages thermal-induced attention mechanisms in both the encoder and decoder stages to effectively integrate RGB and thermal modalities. In the encoder, a Thermal-induced Modality-interaction Self-Attention mechanism is introduced to extract powerful cross-modal features. In the decoder, a Thermal-induced Modality-interaction Dual-Branch Attention mechanism is designed to generate accurate saliency predictions by constructing modality-aware integration of foreground and background branches. Extensive experiments demonstrate that TMMANet outperforms most state-of-the-art RGB-T, RGB and RGB-D methods across various evaluation metrics, highlighting its effectiveness in enhancing RGB-T saliency detection performance. The related data for TMMANet are released at https://github.com/SUTPangYu/TMMANet.
Pages: 19