Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding

Cited by: 1
Authors
Jiang, Li [1]
Yang, Zetong [2]
Shi, Shaoshuai [1]
Golyanik, Vladislav [1]
Dai, Dengxin [1]
Schiele, Bernt [1]
Affiliations
[1] Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany
[2] The Chinese University of Hong Kong (CUHK), Hong Kong, People's Republic of China
DOI
10.1109/CVPR52729.2023.00119
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Masked signal modeling has greatly advanced self-supervised pre-training for language and 2D images. However, it is still not fully explored in 3D scene understanding. Thus, this paper introduces Masked Shape Prediction (MSP), a new framework to conduct masked signal modeling in 3D scenes. MSP uses the essential 3D semantic cue, i.e., geometric shape, as the prediction target for masked points. The context-enhanced shape target consisting of explicit shape context and implicit deep shape feature is proposed to facilitate exploiting contextual cues in shape prediction. Meanwhile, the pre-training architecture in MSP is carefully designed to alleviate the masked shape leakage from point coordinates. Experiments on multiple 3D understanding tasks on both indoor and outdoor datasets demonstrate the effectiveness of MSP in learning good feature representations to consistently boost downstream performance.
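The general idea of predicting a shape target at masked points can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names `mask_points` and `shape_context_target`, the mask ratio, the radius, and the coarse occupancy histogram, which stands in for an explicit shape descriptor. It is not the paper's actual target, network, or leakage-avoiding architecture.

```python
import numpy as np

def mask_points(points, mask_ratio, rng):
    """Return a boolean array marking a random subset of points as masked."""
    n = len(points)
    n_masked = int(round(mask_ratio * n))
    masked_idx = rng.choice(n, size=n_masked, replace=False)
    is_masked = np.zeros(n, dtype=bool)
    is_masked[masked_idx] = True
    return is_masked

def shape_context_target(points, centers, radius, bins=2):
    """Hypothetical shape target: for each center, a normalized histogram of
    neighbor positions over bins**3 cells of a cube with half-side `radius`."""
    targets = np.zeros((len(centers), bins ** 3))
    for i, center in enumerate(centers):
        offsets = points - center
        inside = np.all(np.abs(offsets) < radius, axis=1)
        # Map offsets from (-radius, radius) to integer cell indices 0..bins-1.
        cells = np.floor((offsets[inside] + radius) / (2 * radius) * bins).astype(int)
        cells = np.clip(cells, 0, bins - 1)
        flat = cells[:, 0] * bins * bins + cells[:, 1] * bins + cells[:, 2]
        hist = np.bincount(flat, minlength=bins ** 3).astype(float)
        targets[i] = hist / max(hist.sum(), 1.0)
    return targets

# Toy scene: 200 random points in the unit cube (synthetic data, not a real scan).
rng = np.random.default_rng(0)
points = rng.uniform(0.0, 1.0, size=(200, 3))
is_masked = mask_points(points, mask_ratio=0.5, rng=rng)

# Targets are computed from the full scene; a model would see only the
# unmasked points and be trained to predict these targets at masked locations.
targets = shape_context_target(points, points[is_masked], radius=0.2)
```

In a real pre-training setup, a network would regress or classify such targets from the visible points only; the sketch stops at target construction.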
Pages: 1168-1178
Page count: 11