S³Net: Self-Supervised Self-Ensembling Network for Semi-Supervised RGB-D Salient Object Detection

Cited by: 4
Authors
Zhu, Lei [1, 2]
Wang, Xiaoqiang [3]
Li, Ping [4]
Yang, Xin [5]
Zhang, Qing [6]
Wang, Weiming [7]
Schonlieb, Carola-Bibiane [8]
Chen, C. L. Philip [9, 10, 11]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[2] Univ Cambridge, Dept Appl Math & Theoret Phys DAMTP, Cambridge CB3 0WA, England
[3] Zhejiang Univ, Coll Comp Sci & Technol, Shatin, Hangzhou 310058, Peoples R China
[4] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong 00852, Peoples R China
[5] Dalian Univ Technol, Dept Comp Sci, Dalian 116024, Peoples R China
[6] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 510006, Peoples R China
[7] Hong Kong Metropolitan Univ, Sch Sci & Technol, Ho Man Tin, Hong Kong 00852, Peoples R China
[8] Univ Cambridge, Dept Appl Math & Theoret Phys DAMTP, Cambridge CB3 0WA, England
[9] South China Univ Technol, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
[10] Dalian Maritime Univ, Nav Coll, Dalian 116026, Peoples R China
[11] Univ Macau, Fac Sci & Technol, Macau 999078, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Saliency detection; Feature extraction; Convolutional neural networks; Task analysis; Detectors; Object detection; Training; RGB-D salient object detection; self-supervised learning; semi-supervised learning; cross-model and cross-level feature aggregation; SEGMENTATION; FUSION
DOI
10.1109/TMM.2021.3129730
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
RGB-D salient object detection aims to detect visually distinctive objects or regions from a paired RGB image and depth image. State-of-the-art RGB-D saliency detectors are mainly based on convolutional neural networks, but nearly all of them share an intrinsic limitation: they rely on labeled data, which degrades detection accuracy in complex cases. In this work, we present a self-supervised self-ensembling network (S³Net) for semi-supervised RGB-D salient object detection that leverages unlabeled data and explores a self-supervised learning mechanism. Specifically, we first build a self-guided convolutional neural network (SG-CNN) as a baseline model by developing a series of three-layer cross-model feature fusion (TCF) modules to leverage complementary information between the depth and RGB modalities, and by formulating an auxiliary self-supervised task that predicts the image rotation angle. Then, to further exploit the knowledge in unlabeled data, we use SG-CNN as both a student network and a teacher network, and encourage the saliency predictions and the self-supervised rotation predictions of the two networks to be consistent on the unlabeled data. Experimental results on seven widely used benchmark datasets demonstrate that our network quantitatively and qualitatively outperforms state-of-the-art methods.
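The two mechanisms named in the abstract, rotation-angle prediction and student-teacher consistency, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the restriction to four rotation angles (multiples of 90 degrees), the exponential-moving-average teacher update, and the mean-squared consistency loss are all assumptions borrowed from common self-ensembling practice. The abstract does not state any of them.

```python
# Illustrative sketch only; every name and update rule here is an assumption,
# not taken from the paper.

def rotate90(image, k):
    """Rotate a 2D list-of-lists image counter-clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counter-clockwise quarter turn.
        image = [list(row) for row in zip(*image)][::-1]
    return image


def rotation_task(image):
    """Build (rotated_image, label) pairs; the label k is the rotation class."""
    return [(rotate90(image, k), k) for k in range(4)]


def ema_update(teacher, student, decay=0.99):
    """Move each teacher weight toward the student weight (assumed EMA rule)."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def consistency_loss(student_pred, teacher_pred):
    """Mean squared difference between two flat lists of predictions."""
    n = len(student_pred)
    return sum((s - t) ** 2 for s, t in zip(student_pred, teacher_pred)) / n
```

In a training loop of this kind, the rotation labels would supervise the auxiliary head on any image without needing manual annotation, while the consistency loss would be computed on unlabeled images between the student's and the teacher's outputs.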
Pages: 676-689
Number of pages: 14