Enhanced RGB-T saliency detection via thermal-guided multi-stage attention network

Authors
Pang, Yu [1]
Huang, Yang [1]
Weng, Chenyu [1]
Lyu, Jialin [1]
Bai, Chuanyue [1]
Yu, Xiaosheng [2]
Affiliations
[1] Shenyang Univ Technol, Sch Artificial Intelligence, Shenyang 110870, Liaoning, Peoples R China
[2] Northeastern Univ, Fac Robot Sci & Engn, Shenyang 110169, Liaoning, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
RGB-T saliency detection; Single-stream network; Multi-stage framework; Modality-interaction; Attention mechanism; FUSION;
DOI
10.1007/s00371-025-03855-3
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Single-stream structures are prevalent in RGB-T saliency detection because they are efficient and lightweight. However, existing multi-modal single-stream methods suffer from limited detection performance, primarily because they do not adequately exploit the strengths of the thermal modality. To address this, we propose a novel single-stream network called Thermal-induced Modality-interaction Multi-stage Attention Network (TMMANet). Our approach applies thermal-induced attention mechanisms in both the encoder and decoder stages to integrate the RGB and thermal modalities effectively. In the encoder, a Thermal-induced Modality-interaction Self-Attention mechanism is introduced to extract powerful cross-modal features. In the decoder, a Thermal-induced Modality-interaction Dual-Branch Attention mechanism is designed to generate accurate saliency predictions by constructing a modality-aware integration of foreground and background branches. Extensive experiments demonstrate that TMMANet outperforms most state-of-the-art RGB-T, RGB, and RGB-D methods under various evaluation metrics, which highlights its effectiveness in enhancing RGB-T saliency detection performance. The data related to TMMANet are released at https://github.com/SUTPangYu/TMMANet.
Pages: 19