Masked Autoencoders for Point Cloud Self-supervised Learning

Cited: 107
Authors
Pang, Yatian [2]
Wang, Wenxiao [3]
Tay, Francis E. H. [2]
Liu, Wei [4]
Tian, Yonghong [1,5]
Yuan, Li [1,5]
Affiliations
[1] Peking Univ, Sch Elect & Comp Engn, Beijing, Peoples R China
[2] Natl Univ Singapore, Singapore, Singapore
[3] Zhejiang Univ, Hangzhou, Peoples R China
[4] Tencent Data Platform, Shenzhen, Peoples R China
[5] PengCheng Lab, Shenzhen, Peoples R China
Source
COMPUTER VISION - ECCV 2022, PT II, 2022, 13662
Keywords
NETWORK;
DOI
10.1007/978-3-031-20086-1_35
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
As a promising scheme for self-supervised learning, masked autoencoding has significantly advanced natural language processing and computer vision. Inspired by this, we propose a neat scheme of masked autoencoders for point cloud self-supervised learning, addressing the challenges posed by the properties of point clouds, including leakage of location information and uneven information density. Concretely, we divide the input point cloud into irregular point patches and randomly mask them at a high ratio. Then, a standard Transformer-based autoencoder, with an asymmetric design and a shifting-mask-tokens operation, learns high-level latent features from the unmasked point patches, aiming to reconstruct the masked ones. Extensive experiments show that our approach is efficient during pre-training and generalizes well to various downstream tasks. The pre-trained models achieve 85.18% accuracy on ScanObjectNN and 94.04% accuracy on ModelNet40, outperforming all other self-supervised learning methods. We show that, with our scheme, a simple architecture built entirely on standard Transformers can surpass dedicated Transformer models trained with supervised learning. Our approach also advances state-of-the-art accuracies by 1.5%-2.3% in few-shot classification. Furthermore, our work demonstrates the feasibility of applying unified architectures from language and images to point clouds. Code is available at https://github.com/Pang-Yatian/Point-MAE.
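To make the pipeline in the abstract concrete, the sketch below mirrors the described steps in PyTorch: group the point cloud into irregular patches, randomly mask a high ratio of them, encode only the visible patches with a standard Transformer, and shift mask tokens to a lightweight decoder that reconstructs the masked patches. This is a minimal illustration, not the authors' released code (see the GitHub link above): the random-center grouping stands in for the FPS + k-NN patching used in practice, the plain MSE loss stands in for the Chamfer-distance reconstruction loss, and all module names and sizes are illustrative assumptions.

# Minimal sketch of the masked-autoencoding scheme described in the abstract above.
# Hypothetical module/helper names; not the authors' released implementation.
import torch
import torch.nn as nn


def group_points(points, num_groups=64, group_size=32):
    # Split a point cloud (B, N, 3) into irregular point patches.
    # Random centers + k-nearest neighbours stand in for the FPS + k-NN grouping
    # typically used to form point patches.
    B, N, _ = points.shape
    idx = torch.randint(0, N, (B, num_groups), device=points.device)
    centers = torch.gather(points, 1, idx.unsqueeze(-1).expand(-1, -1, 3))      # (B, G, 3)
    knn = torch.cdist(centers, points).topk(group_size, largest=False).indices  # (B, G, K)
    patches = torch.gather(points.unsqueeze(1).expand(-1, num_groups, -1, -1),
                           2, knn.unsqueeze(-1).expand(-1, -1, -1, 3))          # (B, G, K, 3)
    return patches - centers.unsqueeze(2), centers  # normalize patches to their centers


class MaskedPointAutoencoderSketch(nn.Module):
    # Asymmetric design: the encoder sees only visible patches; mask tokens are
    # shifted to a lightweight decoder that reconstructs the masked patches.
    def __init__(self, dim=256, mask_ratio=0.6, group_size=32):
        super().__init__()
        self.mask_ratio, self.group_size = mask_ratio, group_size
        self.patch_embed = nn.Sequential(nn.Linear(group_size * 3, dim), nn.GELU(),
                                         nn.Linear(dim, dim))
        self.pos_embed = nn.Linear(3, dim)  # positional embedding from patch centers
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=6)
        self.decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=2)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.head = nn.Linear(dim, group_size * 3)  # predict masked patch coordinates

    def forward(self, points):
        patches, centers = group_points(points, group_size=self.group_size)
        B, G, K, _ = patches.shape
        num_mask = int(self.mask_ratio * G)

        # Random masking at a high ratio.
        perm = torch.rand(B, G, device=points.device).argsort(dim=1)
        vis_idx, mask_idx = perm[:, num_mask:], perm[:, :num_mask]
        pick = lambda x, i: torch.gather(x, 1, i.unsqueeze(-1).expand(-1, -1, x.size(-1)))

        tokens = self.patch_embed(patches.flatten(2))  # (B, G, dim)
        pos = self.pos_embed(centers)                  # (B, G, dim)

        # Encoder processes only the visible (unmasked) patches.
        latent = self.encoder(pick(tokens, vis_idx) + pick(pos, vis_idx))

        # Mask tokens, carrying the masked centers' positions, are appended at the decoder.
        masks = self.mask_token.expand(B, num_mask, -1) + pick(pos, mask_idx)
        decoded = self.decoder(torch.cat([latent, masks], dim=1))

        # Reconstruct masked patches; plain MSE here, Chamfer distance in the paper.
        pred = self.head(decoded[:, -num_mask:]).reshape(B, num_mask, K, 3)
        target = torch.gather(patches, 1, mask_idx.view(B, num_mask, 1, 1).expand(-1, -1, K, 3))
        return ((pred - target) ** 2).mean()


if __name__ == "__main__":
    loss = MaskedPointAutoencoderSketch()(torch.randn(2, 1024, 3))
    print(f"reconstruction loss: {loss.item():.4f}")

Running only the visible tokens through the encoder is what keeps pre-training efficient at high mask ratios; shifting the mask tokens to the small decoder, rather than feeding them to the encoder, also avoids leaking the masked patches' locations into the learned representation.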
Pages: 604-621
Number of pages: 18
Related Papers
50 records in total
  • [1] PatchMixing Masked Autoencoders for 3D Point Cloud Self-Supervised Learning
    Lin, Chengxing
    Xu, Wenju
    Zhu, Jian
    Nie, Yongwei
    Cai, Ruichu
    Xu, Xuemiao
[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 9882 - 9897
  • [2] Contrastive Predictive Autoencoders for Dynamic Point Cloud Self-Supervised Learning
    Sheng, Xiaoxiao
    Shen, Zhiqiang
    Xiao, Gang
    [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, : 9802 - 9810
  • [3] GraphMAE: Self-Supervised Masked Graph Autoencoders
    Hou, Zhenyu
    Liu, Xiao
    Cen, Yukuo
    Dong, Yuxiao
    Yang, Hongxia
    Wang, Chunjie
    Tang, Jie
    [J]. PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 594 - 604
  • [4] Masked Discrimination for Self-supervised Learning on Point Clouds
    Liu, Haotian
    Cai, Mu
    Lee, Yong Jae
    [J]. COMPUTER VISION - ECCV 2022, PT II, 2022, 13662 : 657 - 675
  • [5] A Self-Supervised Learning Approach to Road Anomaly Detection Using Masked Autoencoders
    Dutta, Proma
    Podder, Kanchon Kanti
    Zhang, Jian
    Hecht, Christian
    Swarna, Surya
    Bhavsar, Parth
    [J]. INTERNATIONAL CONFERENCE ON TRANSPORTATION AND DEVELOPMENT 2024: PAVEMENTS AND INFRASTRUCTURE SYSTEMS, ICTD 2024, 2024, : 536 - 547
  • [6] MaeFE: Masked Autoencoders Family of Electrocardiogram for Self-Supervised Pretraining and Transfer Learning
    Zhang, Huaicheng
    Liu, Wenhan
    Shi, Jiguang
    Chang, Sheng
    Wang, Hao
    He, Jin
    Huang, Qijun
    [J]. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [7] Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos
    Shen, Zhiqiang
    Sheng, Xiaoxiao
    Fan, Hehe
    Wang, Longguang
    Guo, Yulan
    Liu, Qiong
    Wen, Hao
    Zhou, Xi
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 16534 - 16543
  • [8] Self-supervised learning for point cloud data: A survey
    Zeng, Changyu
    Wang, Wei
    Nguyen, Anh
    Xiao, Jimin
    Yue, Yutao
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 237
  • [9] Domain Invariant Masked Autoencoders for Self-supervised Learning from Multi-domains
    Yang, Haiyang
    Tang, Shixiang
    Chen, Meilin
    Wang, Yizhou
    Zhu, Feng
    Bai, Lei
    Zhao, Rui
    Ouyang, Wanli
    [J]. COMPUTER VISION, ECCV 2022, PT XXXI, 2022, 13691 : 151 - 168