Masked Autoencoder for Self-Supervised Pre-training on Lidar Point Clouds

被引:16
|
作者
Hess, Georg [1 ,2 ]
Jaxing, Johan [1 ]
Svensson, Elias [1 ]
Hagerman, David [1 ]
Petersson, Christoffer [1 ,2 ]
Svensson, Lennart [1 ]
机构
[1] Chalmers Univ Technol, Gothenburg, Sweden
[2] Zenseact, Gothenburg, Sweden
基金
瑞典研究理事会;
关键词
D O I
10.1109/WACVW58289.2023.00039
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Masked autoencoding has become a successful pretraining paradigm for Transformer models for text, images, and, recently, point clouds. Raw automotive datasets are suitable candidates for self-supervised pre-training as they generally are cheap to collect compared to annotations for tasks like 3D object detection (OD). However, the development of masked autoencoders for point clouds has focused solely on synthetic and indoor data. Consequently, existing methods have tailored their representations and models toward small and dense point clouds with homogeneous point densities. In this work, we study masked autoencoding for point clouds in an automotive setting, which are sparse and for which the point density can vary drastically among objects in the same scene. To this end, we propose VoxelMAE, a simple masked autoencoding pre-training scheme designed for voxel representations. We pre-train the backbone of a Transformer-based 3D object detector to reconstruct masked voxels and to distinguish between empty and non-empty voxels. Our method improves the 3D OD performance by 1.75 mAP points and 1.05 NDS on the challenging nuScenes dataset. Further, we show that by pre-training with Voxel-MAE, we require only 40% of the annotated data to outperform a randomly initialized equivalent. Code is available at https://github.com/georghess/ voxel-mae.
引用
收藏
页码:350 / 359
页数:10
相关论文
共 50 条
  • [41] FALL DETECTION USING SELF-SUPERVISED PRE-TRAINING MODEL
    Yhdego, Haben
    Audette, Michel
    Paolini, Christopher
    PROCEEDINGS OF THE 2022 ANNUAL MODELING AND SIMULATION CONFERENCE (ANNSIM'22), 2022, : 361 - 371
  • [42] SPAKT: A Self-Supervised Pre-TrAining Method for Knowledge Tracing
    Ma, Yuling
    Han, Peng
    Qiao, Huiyan
    Cui, Chaoran
    Yin, Yilong
    Yu, Dehu
    IEEE ACCESS, 2022, 10 : 72145 - 72154
  • [43] MEASURING THE IMPACT OF DOMAIN FACTORS IN SELF-SUPERVISED PRE-TRAINING
    Sanabria, Ramon
    Wei-Ning, Hsu
    Alexei, Baevski
    Auli, Michael
    2023 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING WORKSHOPS, ICASSPW, 2023,
  • [44] Correlational Image Modeling for Self-Supervised Visual Pre-Training
    Li, Wei
    Xie, Jiahao
    Loy, Chen Change
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 15105 - 15115
  • [45] Contrastive Self-Supervised Pre-Training for Video Quality Assessment
    Chen, Pengfei
    Li, Leida
    Wu, Jinjian
    Dong, Weisheng
    Shi, Guangming
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 458 - 471
  • [46] PointVST: Self-Supervised Pre-Training for 3D Point Clouds via View-Specific Point-to-Image Translation
    Zhang, Qijian
    Hou, Junhui
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2024, 30 (10) : 6900 - 6912
  • [47] PRED: Pre-training via Semantic Rendering on LiDAR Point Clouds
    Yang, Hao
    Wang, Haiyang
    Dai, Di
    Wang, Liwei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [48] LPRnet: A Self-Supervised Registration Network for LiDAR and Photogrammetric Point Clouds
    Wang, Chen
    Gu, Yanfeng
    Li, Xian
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [49] A Survey on Masked Autoencoder for Visual Self-supervised Learning
    Zhang, Chaoning
    Zhang, Chenshuang
    Song, Junha
    Yi, John Seon Keun
    Kweon, In So
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 6805 - 6813
  • [50] Contrastive Learning for Self-Supervised Pre-Training of Point Cloud Segmentation Networks With Image Data
    Janda, Andrej
    Wagstaff, Brandon
    Ng, Edwin G.
    Kelly, Jonathan
    2023 20TH CONFERENCE ON ROBOTS AND VISION, CRV, 2023, : 145 - 152