Perceptual localization and focus refinement network for RGB-D salient object detection

Times cited: 0

Authors:

Han, Jinyu [1]

Wang, Mengyin [1]

Wu, Weiyi [1]

Jia, Xu [2]

Affiliations:

[1] Dalian Minzu Univ, Sch Informat & Commun Engn, Dalian 116600, Peoples R China

[2] Liaoning Univ Technol, Sch Elect & Informat Engn, Jinzhou 121001, Peoples R China

Source:

EXPERT SYSTEMS WITH APPLICATIONS | 2025 / Volume 259

Funding:

National Natural Science Foundation of China;

Keywords:

Salient object detection; RGB-D; Multi-level; Cross-modal; Fusion network; IMAGE;

DOI:

10.1016/j.eswa.2024.125278

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The RGB-D salient object detection task still faces three challenges: (1) how to effectively integrate the advantageous information from different modalities, (2) how to effectively mine the common information of features at different levels, and (3) how to detect salient objects in complex scenes, such as those with complex backgrounds, low-quality depth maps, small targets, and high foreground-background similarity. To address these challenges, we propose a novel Perceptual Localization and Focus Refinement Network, termed PLFRNet, based on the mechanism by which human vision captures salient objects in images. The network includes three key components: an encoder, a Perceptual Localization Module (PLM), and a Focus-Refinement Decoder (FRD). Specifically, we first adopt a two-stream asymmetric Pyramid Visual Transformer as the encoder to extract RGB and depth features. Then, we develop the PLM, guided by a carefully designed Perceptual Localization Unit (PLU). This module mines the common information of features at different levels and integrates the advantageous information from multiple modalities to localize salient objects. Finally, we propose the FRD, which focuses on detailed information under the guidance of an attention mechanism and refines the localized objects by gradually interacting with low-level features to achieve salient object detection. Extensive experimental results show that this method achieves state-of-the-art performance compared with 13 RGB-D models on 6 public datasets. The code is released at https://github.com/hjy0518/PLFRNet/.

Pages: 14

