Masked Autoencoders for Point Cloud Self-supervised Learning

Cited by: 177
Authors
Pang, Yatian [2]
Wang, Wenxiao [3]
Tay, Francis E. H. [2]
Liu, Wei [4]
Tian, Yonghong [1, 5]
Yuan, Li [1, 5]
Affiliations
[1] Peking Univ, Sch Elect & Comp Engn, Beijing, Peoples R China
[2] Natl Univ Singapore, Singapore, Singapore
[3] Zhejiang Univ, Hangzhou, Peoples R China
[4] Tencent Data Platform, Shenzhen, Peoples R China
[5] PengCheng Lab, Shenzhen, Peoples R China
Source
Keywords
NETWORK
DOI
10.1007/978-3-031-20086-1_35
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
As a promising scheme of self-supervised learning, masked autoencoding has significantly advanced natural language processing and computer vision. Inspired by this, we propose a neat scheme of masked autoencoders for point cloud self-supervised learning, addressing the challenges posed by point cloud's properties, including leakage of location information and uneven information density. Concretely, we divide the input point cloud into irregular point patches and randomly mask them at a high ratio. Then, a standard Transformer based autoencoder, with an asymmetric design and a shifting mask tokens operation, learns high-level latent features from unmasked point patches, aiming to reconstruct the masked point patches. Extensive experiments show that our approach is efficient during pre-training and generalizes well on various downstream tasks. The pre-trained models achieve 85.18% accuracy on ScanObjectNN and 94.04% accuracy on ModelNet40, outperforming all the other self-supervised learning methods. We show with our scheme, a simple architecture entirely based on standard Transformers can surpass dedicated Transformer models from supervised learning. Our approach also advances state-of-the-art accuracies by 1.5%-2.3% in the few-shot classification. Furthermore, our work inspires the feasibility of applying unified architectures from languages and images to the point cloud. Codes are available at https://github.com/Pang-Yatian/Point-MAE.
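The patch-and-mask step described in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository. The abstract does not say how patches are formed, so the sketch assumes farthest point sampling for patch centers and k-nearest-neighbor grouping for patch membership. The function names, the patch sizes, and the 0.6 mask ratio are hypothetical values chosen only for illustration.

```python
import math
import random


def farthest_point_sampling(points, num_centers, rng):
    """Pick num_centers point indices that are spread out over the cloud.

    Assumption: the abstract does not specify how patch centers are chosen.
    """
    first = rng.randrange(len(points))
    chosen = [first]
    # Distance from every point to its nearest already-chosen center.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(chosen) < num_centers:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


def make_patches(points, num_patches, patch_size, rng):
    """Group the cloud into irregular patches: each center plus its nearest neighbors."""
    centers = farthest_point_sampling(points, num_patches, rng)
    patches = []
    for c in centers:
        order = sorted(range(len(points)),
                       key=lambda i: math.dist(points[i], points[c]))
        patches.append(order[:patch_size])
    return centers, patches


def random_mask(num_patches, mask_ratio, rng):
    """Return one boolean per patch; True means the patch is hidden from the encoder."""
    num_masked = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


rng = random.Random(0)
# Toy cloud of 256 random 3D points (illustrative data, not a real dataset).
cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(256)]
centers, patches = make_patches(cloud, num_patches=16, patch_size=8, rng=rng)
mask = random_mask(len(patches), mask_ratio=0.6, rng=rng)  # hypothetical ratio
visible = [p for p, m in zip(patches, mask) if not m]
```

In this sketch only the `visible` patches would be passed to an encoder, and the masked ones would serve as reconstruction targets, which mirrors the asymmetric design the abstract mentions.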
Pages: 604-621
Page count: 18
Related papers
50 in total
  • [41] Self-Supervised Learning for 3-D Point Clouds Based on a Masked Linear Autoencoder
    Yang, Hongxin
    Wang, Ruisheng
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61 : 1 - 11
  • [42] MGM-AE: Self-Supervised Learning on 3D Shape Using Mesh Graph Masked Autoencoders
    Yang, Zhangsihao
    Ding, Kaize
    Liu, Huan
    Wang, Yalin
    2024 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION, WACV 2024, 2024, : 3291 - 3301
  • [43] ProteinMAE: masked autoencoder for protein surface self-supervised learning
    Yuan, Mingzhi
    Shen, Ao
    Fu, Kexue
    Guan, Jiaming
    Ma, Yingfan
    Qiao, Qin
    Wang, Manning
    BIOINFORMATICS, 2023, 39 (12)
  • [44] Masked Motion Encoding for Self-Supervised Video Representation Learning
    Sun, Xinyu
    Chen, Peihao
    Chen, Liangwei
    Li, Changhao
    Li, Thomas H.
    Tan, Mingkui
    Gan, Chuang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 2235 - 2245
  • [45] NeRF-MAE: Masked AutoEncoders for Self-supervised 3D Representation Learning for Neural Radiance Fields
    Irshad, Muhammad Zubair
    Zakharov, Sergey
    Guizilini, Vitor
    Gaidon, Adrien
    Kira, Zsolt
    Ambrus, Rares
    COMPUTER VISION - ECCV 2024, PT LXXXVIII, 2025, 15146 : 434 - 453
  • [46] Self-Supervised Point Set Local Descriptors for Point Cloud Registration
    Yuan, Yijun
    Borrmann, Dorit
    Hou, Jiawei
    Ma, Yuexin
    Nuechter, Andreas
    Schwertfeger, Soren
    SENSORS, 2021, 21 (02) : 1 - 18
  • [47] Forecast-MAE: Self-supervised Pre-training for Motion Forecasting with Masked Autoencoders
    Cheng, Jie
    Mei, Xiaodong
    Liu, Ming
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 8645 - 8655
  • [48] DHGCN: Dynamic Hop Graph Convolution Network for Self-Supervised Point Cloud Learning
    Jiang, Jincen
    Zhao, Lizhi
    Lu, Xuequan
    Hu, Wei
    Razzak, Imran
    Wang, Meili
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 11, 2024, : 12883 - 12891
  • [49] SSL-Net: Point-Cloud Generation Network With Self-Supervised Learning
    Sun, Ran
    Gao, Yongbin
    Fang, Zhijun
    Wang, Anjie
    Zhong, Cengsi
    IEEE ACCESS, 2019, 7 : 82206 - 82217
  • [50] Self-supervised Adversarial Masking for 3D Point Cloud Representation Learning
    Szachniewicz, Michal
    Kozlowski, Wojciech
    Stypulkowski, Michal
    Zieba, Maciej
    INTELLIGENT INFORMATION AND DATABASE SYSTEMS, PT II, ACIIDS 2024, 2024, 14796 : 156 - 168