Improving RGB-D Salient Object Detection via Modality-Aware Decoder

Cited by: 0

Authors:

Song, Mengke [1,2]

Song, Wenfeng [3]

Yang, Guowei [4]

Chen, Chenglizhao [1,2]

Affiliations:

[1] China Univ Petr East China, Coll Comp Sci & Technol, Qingdao 266580, Peoples R China

[2] China Univ Petr East China, Qingdao Inst Software, Qingdao 266580, Peoples R China

[3] Beijing Informat Sci & Technol Univ, Comp Sch, Beijing 100192, Peoples R China

[4] Qingdao Univ, Sch Elect Informat, Qingdao 266071, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2022 / Vol. 31

Funding:

National Natural Science Foundation of China;

Keywords:

Decoding; Object detection; Training; Task analysis; Saliency detection; Image segmentation; Feature extraction; RGB-D salient object detection; modality-aware fusion; deep learning; GRAPH CONVOLUTION NETWORK; IMAGE; ATTENTION; FUSION;

DOI:

10.1109/TIP.2022.3205747

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Most existing RGB-D salient object detection (SOD) methods focus primarily on cross-modal and cross-level saliency fusion, which has proved to be efficient and effective. However, these methods still have a critical limitation: their fusion patterns, typically combinations of selective characteristics and their variations, depend too heavily on the network's non-linear adaptability. In such methods, the balance between RGB and D (depth) is formulated individually on intermediate feature slices, so the relation at the modality level may not be learned properly. The optimal RGB-D combination differs across RGB-D scenarios, and the exact complementary status is frequently determined by multiple modality-level factors, such as the quality of D, the complexity of the RGB scene, and the degree of harmony between them. Existing approaches may therefore find it difficult to achieve further performance breakthroughs, as their methodologies are comparatively insensitive to modality. To address this problem, this paper presents the Modality-aware Decoder (MaD). The critical technical innovations include a series of feature embedding, modality reasoning, and feature back-projecting and collecting strategies, all of which upgrade the widely used multi-scale and multi-level decoding process to be modality-aware. MaD achieves competitive performance against other state-of-the-art (SOTA) models without using any fancy tricks in the decoder's design. Code and results will be publicly available at https://github.com/MengkeSong/MaD.


Pages: 6124-6138

Page count: 15

