Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding

Cited by: 1
Authors
Jiang, Li [1]
Yang, Zetong [2]
Shi, Shaoshuai [1]
Golyanik, Vladislav [1]
Dai, Dengxin [1]
Schiele, Bernt [1]
Affiliations
[1] Saarland Informatics Campus, Max Planck Institute for Informatics, Saarbrücken, Germany
[2] The Chinese University of Hong Kong (CUHK), Hong Kong, People's Republic of China
DOI
10.1109/CVPR52729.2023.00119
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Masked signal modeling has greatly advanced self-supervised pre-training for language and 2D images. However, it is still not fully explored in 3D scene understanding. Thus, this paper introduces Masked Shape Prediction (MSP), a new framework to conduct masked signal modeling in 3D scenes. MSP uses the essential 3D semantic cue, i.e., geometric shape, as the prediction target for masked points. The context-enhanced shape target consisting of explicit shape context and implicit deep shape feature is proposed to facilitate exploiting contextual cues in shape prediction. Meanwhile, the pre-training architecture in MSP is carefully designed to alleviate the masked shape leakage from point coordinates. Experiments on multiple 3D understanding tasks on both indoor and outdoor datasets demonstrate the effectiveness of MSP in learning good feature representations to consistently boost downstream performance.
Pages: 1168-1178
Page count: 11