Three-stream interaction decoder network for RGB-thermal salient object detection

Cited by: 7
Authors
Huo, Fushuo [1]
Zhu, Xuegui [1]
Li, Bingheng [2]
Affiliations
[1] Chongqing Univ, State Key Lab Power Transmiss Equipment & Syst Sec, Chongqing 400044, Peoples R China
[2] Xidian Univ, Sch Elect Engn, Xian 710071, Peoples R China
Keywords
Salient object detection; RGB-thermal; Multimodal fusion; Contextual information; Three-stream decoder;
DOI
10.1016/j.knosys.2022.110007
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Salient Object Detection (SOD) has improved remarkably over the past decade. However, RGB-based SOD methods may fail in real-world applications under extreme conditions such as low light and cluttered backgrounds. Thermal (T) images capture the heat radiated from object surfaces and can overcome such conditions, so some researchers have introduced the T modality to the SOD task. Existing RGB-T SOD methods fail to explicitly explore multi-scale complementary saliency cues from the two modalities and do not fully exploit the individual RGB and T modalities. To address these problems, we propose the Three-stream Interaction Decoder Network (TIDNet) for the RGB-T SOD task. Specifically, the feature maps from the encoder branches are fed to the three-stream interaction decoder for in-depth saliency exploration, capturing both single-modality and multi-modality saliency cues. For the single-modality decoder streams, Contextual-enhanced Channel Reduction (CCR) units first reduce the channel dimension of the feature maps from the RGB and T modalities, reducing the computational burden and discriminatively enriching the multi-scale information. For the multi-modality decoder stream, a Multi-scale Cross Modality Fusion (MCMF) unit is proposed to explore the complementary multi-scale information from the RGB and T modalities. Internal and Multiple Decoder Interaction (IMDI) units then further mine the specific and complementary saliency cues from the three-stream decoder. Three-stream deep supervision is deployed at each feature level to facilitate training. Comprehensive experiments show that our method outperforms fifteen state-of-the-art methods in terms of seven metrics. The code and models are available at https://github.com/huofushuo/TIDNet. (c) 2022 Published by Elsevier B.V.
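The three-stream deep supervision mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which loss the authors use, so everything below is an assumption: binary cross-entropy as the per-prediction loss, equal weights for all terms, hypothetical stream names ("rgb", "thermal", "fusion"), and predictions already resized to the ground-truth size and flattened to plain lists. For the authors' actual implementation, see the repository linked in the abstract.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between two flat lists of values in [0, 1]."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def three_stream_deep_supervision(preds, target):
    """Sum the BCE over every (stream, feature level) prediction.

    preds:  dict mapping a stream name to a list of per-level saliency
            predictions; the names and the flat-list layout are
            hypothetical, chosen only for this illustration.
    target: flat ground-truth saliency mask with values in {0, 1}.
    Equal weighting of all terms is an assumption, not the paper's setting.
    """
    return sum(bce(p, target) for levels in preds.values() for p in levels)


# Toy usage: three streams, two feature levels each, a two-pixel mask.
target = [1.0, 0.0]
preds = {
    "rgb": [[0.5, 0.5], [0.5, 0.5]],
    "thermal": [[0.5, 0.5], [0.5, 0.5]],
    "fusion": [[0.5, 0.5], [0.5, 0.5]],
}
loss = three_stream_deep_supervision(preds, target)
```

In this toy case every prediction is maximally uncertain, so each of the six terms contributes ln 2 and the total is 6 ln 2. Supervising every stream at every level gives each decoder branch its own gradient signal rather than relying only on the final fused output.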
Pages: 11
Related papers
50 in total
  • [1] Three-Stream Attention-Aware Network for RGB-D Salient Object Detection
    Chen, Hao
    Li, Youfu
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (06) : 2825 - 2835
  • [2] Mirror complementary transformer network for RGB-thermal salient object detection
    Jiang, Xiurong
    Hou, Yifan
    Tian, Hui
    Zhu, Lin
    [J]. IET COMPUTER VISION, 2024, 18 (01) : 15 - 32
  • [3] Multi-Interactive Dual-Decoder for RGB-Thermal Salient Object Detection
    Tu, Zhengzheng
    Li, Zhun
    Li, Chenglong
    Lang, Yang
    Tang, Jin
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 5678 - 5691
  • [4] Real-Time One-Stream Semantic-Guided Refinement Network for RGB-Thermal Salient Object Detection
    Huo, Fushuo
    Zhu, Xuegui
    Zhang, Qian
    Liu, Ziming
    Yu, Wenchao
    [J]. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2022, 71
  • [5] Cross-Collaborative Fusion-Encoder Network for Robust RGB-Thermal Salient Object Detection
    Liao, Guibiao
    Gao, Wei
    Li, Ge
    Wang, Junle
    Kwong, Sam
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (11) : 7646 - 7661
  • [6] Position-Aware Relation Learning for RGB-Thermal Salient Object Detection
    Zhou, Heng
    Tian, Chunna
    Zhang, Zhenxi
    Li, Chengyang
    Ding, Yuxuan
    Xie, Yongqiang
    Li, Zhongbo
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 2593 - 2607
  • [7] Three-Stream Cross-Modal Feature Aggregation Network for Light Field Salient Object Detection
    Wang, Anzhi
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 46 - 50
  • [8] Three-stream network with context convolution module for human-object interaction detection
    Siadari, Thomhert S.
    Han, Mikyong
    Yoon, Hyunjin
    [J]. ETRI JOURNAL, 2020, 42 (02) : 230 - 238
  • [9] CFIDNet: cascaded feature interaction decoder for RGB-D salient object detection
    Chen, Tianyou
    Hu, Xiaoguang
    Xiao, Jin
    Zhang, Guofeng
    Wang, Shaojie
    [J]. NEURAL COMPUTING & APPLICATIONS, 2022, 34 (10) : 7547 - 7563