Masked Autoencoders for Point Cloud Self-supervised Learning

Cited: 107
Authors
Pang, Yatian [2]
Wang, Wenxiao [3]
Tay, Francis E. H. [2]
Liu, Wei [4]
Tian, Yonghong [1,5]
Yuan, Li [1,5]
Affiliations
[1] Peking Univ, Sch Elect & Comp Engn, Beijing, Peoples R China
[2] Natl Univ Singapore, Singapore, Singapore
[3] Zhejiang Univ, Hangzhou, Peoples R China
[4] Tencent Data Platform, Shenzhen, Peoples R China
[5] PengCheng Lab, Shenzhen, Peoples R China
Source
COMPUTER VISION - ECCV 2022, PT II, 2022, 13662
Keywords
NETWORK;
DOI
10.1007/978-3-031-20086-1_35
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
As a promising scheme for self-supervised learning, masked autoencoding has significantly advanced natural language processing and computer vision. Inspired by this, we propose a neat scheme of masked autoencoders for point cloud self-supervised learning, addressing the challenges posed by the properties of point clouds, including leakage of location information and uneven information density. Concretely, we divide the input point cloud into irregular point patches and randomly mask them at a high ratio. Then, a standard Transformer-based autoencoder, with an asymmetric design and a shifting-mask-tokens operation, learns high-level latent features from the unmasked point patches, aiming to reconstruct the masked ones. Extensive experiments show that our approach is efficient during pre-training and generalizes well to various downstream tasks. The pre-trained models achieve 85.18% accuracy on ScanObjectNN and 94.04% accuracy on ModelNet40, outperforming all other self-supervised learning methods. We show that, with our scheme, a simple architecture built entirely on standard Transformers can surpass dedicated Transformer models trained with supervised learning. Our approach also advances state-of-the-art accuracies by 1.5%-2.3% in few-shot classification. Furthermore, our work demonstrates the feasibility of applying unified architectures from language and images to point clouds. Code is available at https://github.com/Pang-Yatian/Point-MAE.
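To make the pipeline in the abstract concrete, the sketch below mirrors the described steps in PyTorch: group the point cloud into irregular patches, randomly mask a high ratio of them, encode only the visible patches with a standard Transformer, and shift mask tokens to a lightweight decoder that reconstructs the masked patches. This is a minimal illustration, not the authors' released code (see the GitHub link above): the random-center grouping stands in for the FPS + k-NN patching used in practice, the plain MSE loss stands in for the Chamfer-distance reconstruction loss, and all module names and sizes are illustrative assumptions.

# Minimal sketch of the masked-autoencoding scheme described in the abstract above.
# Hypothetical module/helper names; not the authors' released implementation.
import torch
import torch.nn as nn


def group_points(points, num_groups=64, group_size=32):
    # Split a point cloud (B, N, 3) into irregular point patches.
    # Random centers + k-nearest neighbours stand in for the FPS + k-NN grouping
    # typically used to form point patches.
    B, N, _ = points.shape
    idx = torch.randint(0, N, (B, num_groups), device=points.device)
    centers = torch.gather(points, 1, idx.unsqueeze(-1).expand(-1, -1, 3))      # (B, G, 3)
    knn = torch.cdist(centers, points).topk(group_size, largest=False).indices  # (B, G, K)
    patches = torch.gather(points.unsqueeze(1).expand(-1, num_groups, -1, -1),
                           2, knn.unsqueeze(-1).expand(-1, -1, -1, 3))          # (B, G, K, 3)
    return patches - centers.unsqueeze(2), centers  # normalize patches to their centers


class MaskedPointAutoencoderSketch(nn.Module):
    # Asymmetric design: the encoder sees only visible patches; mask tokens are
    # shifted to a lightweight decoder that reconstructs the masked patches.
    def __init__(self, dim=256, mask_ratio=0.6, group_size=32):
        super().__init__()
        self.mask_ratio, self.group_size = mask_ratio, group_size
        self.patch_embed = nn.Sequential(nn.Linear(group_size * 3, dim), nn.GELU(),
                                         nn.Linear(dim, dim))
        self.pos_embed = nn.Linear(3, dim)  # positional embedding from patch centers
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=6)
        self.decoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=2)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.head = nn.Linear(dim, group_size * 3)  # predict masked patch coordinates

    def forward(self, points):
        patches, centers = group_points(points, group_size=self.group_size)
        B, G, K, _ = patches.shape
        num_mask = int(self.mask_ratio * G)

        # Random masking at a high ratio.
        perm = torch.rand(B, G, device=points.device).argsort(dim=1)
        vis_idx, mask_idx = perm[:, num_mask:], perm[:, :num_mask]
        pick = lambda x, i: torch.gather(x, 1, i.unsqueeze(-1).expand(-1, -1, x.size(-1)))

        tokens = self.patch_embed(patches.flatten(2))  # (B, G, dim)
        pos = self.pos_embed(centers)                  # (B, G, dim)

        # Encoder processes only the visible (unmasked) patches.
        latent = self.encoder(pick(tokens, vis_idx) + pick(pos, vis_idx))

        # Mask tokens, carrying the masked centers' positions, are appended at the decoder.
        masks = self.mask_token.expand(B, num_mask, -1) + pick(pos, mask_idx)
        decoded = self.decoder(torch.cat([latent, masks], dim=1))

        # Reconstruct masked patches; plain MSE here, Chamfer distance in the paper.
        pred = self.head(decoded[:, -num_mask:]).reshape(B, num_mask, K, 3)
        target = torch.gather(patches, 1, mask_idx.view(B, num_mask, 1, 1).expand(-1, -1, K, 3))
        return ((pred - target) ** 2).mean()


if __name__ == "__main__":
    loss = MaskedPointAutoencoderSketch()(torch.randn(2, 1024, 3))
    print(f"reconstruction loss: {loss.item():.4f}")

Running only the visible tokens through the encoder is what keeps pre-training efficient at high mask ratios; shifting the mask tokens to the small decoder, rather than feeding them to the encoder, also avoids leaking the masked patches' locations into the learned representation.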
Pages: 604-621
Number of pages: 18
Related Papers
50 records in total
  • [1] PatchMixing Masked Autoencoders for 3D Point Cloud Self-Supervised Learning
    Lin, Chengxing
    Xu, Wenju
    Zhu, Jian
    Nie, Yongwei
    Cai, Ruichu
    Xu, Xuemiao
[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 9882 - 9897
  • [2] Contrastive Predictive Autoencoders for Dynamic Point Cloud Self-Supervised Learning
    Sheng, Xiaoxiao
    Shen, Zhiqiang
    Xiao, Gang
    [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 8, 2023, : 9802 - 9810
  • [3] GraphMAE: Self-Supervised Masked Graph Autoencoders
    Hou, Zhenyu
    Liu, Xiao
    Cen, Yukuo
    Dong, Yuxiao
    Yang, Hongxia
    Wang, Chunjie
    Tang, Jie
    [J]. PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022, : 594 - 604
  • [4] Masked Discrimination for Self-supervised Learning on Point Clouds
    Liu, Haotian
    Cai, Mu
    Lee, Yong Jae
    [J]. COMPUTER VISION - ECCV 2022, PT II, 2022, 13662 : 657 - 675
  • [5] A Self-Supervised Learning Approach to Road Anomaly Detection Using Masked Autoencoders
    Dutta, Proma
    Podder, Kanchon Kanti
    Zhang, Jian
    Hecht, Christian
    Swarna, Surya
    Bhavsar, Parth
    [J]. INTERNATIONAL CONFERENCE ON TRANSPORTATION AND DEVELOPMENT 2024: PAVEMENTS AND INFRASTRUCTURE SYSTEMS, ICTD 2024, 2024, : 536 - 547
  • [6] MaeFE: Masked Autoencoders Family of Electrocardiogram for Self-Supervised Pretraining and Transfer Learning
    Zhang, Huaicheng
    Liu, Wenhan
    Shi, Jiguang
    Chang, Sheng
    Wang, Hao
    He, Jin
    Huang, Qijun
    [J]. IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [7] Masked Spatio-Temporal Structure Prediction for Self-supervised Learning on Point Cloud Videos
    Shen, Zhiqiang
    Sheng, Xiaoxiao
    Fan, Hehe
    Wang, Longguang
    Guo, Yulan
    Liu, Qiong
    Wen, Hao
    Zhou, Xi
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 16534 - 16543
  • [8] Self-supervised learning for point cloud data: A survey
    Zeng, Changyu
    Wang, Wei
    Nguyen, Anh
    Xiao, Jimin
    Yue, Yutao
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 237
  • [9] Domain Invariant Masked Autoencoders for Self-supervised Learning from Multi-domains
    Yang, Haiyang
    Tang, Shixiang
    Chen, Meilin
    Wang, Yizhou
    Zhu, Feng
    Bai, Lei
    Zhao, Rui
    Ouyang, Wanli
    [J]. COMPUTER VISION, ECCV 2022, PT XXXI, 2022, 13691 : 151 - 168