Masked Autoencoders for Point Cloud Self-supervised Learning

Cited by: 177

Authors:

Pang, Yatian [2]

Wang, Wenxiao [3]

Tay, Francis E. H. [2]

Liu, Wei [4]

Tian, Yonghong [1,5]

Yuan, Li [1,5]

Affiliations:

[1] Peking Univ, Sch Elect & Comp Engn, Beijing, Peoples R China

[2] Natl Univ Singapore, Singapore, Singapore

[3] Zhejiang Univ, Hangzhou, Peoples R China

[4] Tencent Data Platform, Shenzhen, Peoples R China

[5] PengCheng Lab, Shenzhen, Peoples R China

Source:

COMPUTER VISION - ECCV 2022, PT II | 2022 / Vol. 13662

Keywords:

NETWORK;

DOI:

10.1007/978-3-031-20086-1_35

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

As a promising scheme for self-supervised learning, masked autoencoding has significantly advanced natural language processing and computer vision. Inspired by this, we propose a neat scheme of masked autoencoders for point cloud self-supervised learning that addresses the challenges posed by the properties of point clouds, including leakage of location information and uneven information density. Concretely, we divide the input point cloud into irregular point patches and randomly mask them at a high ratio. Then, a standard Transformer-based autoencoder, with an asymmetric design and a shifting mask tokens operation, learns high-level latent features from the unmasked point patches, aiming to reconstruct the masked point patches. Extensive experiments show that our approach is efficient during pre-training and generalizes well to various downstream tasks. The pre-trained models achieve 85.18% accuracy on ScanObjectNN and 94.04% accuracy on ModelNet40, outperforming all other self-supervised learning methods. We show that, with our scheme, a simple architecture based entirely on standard Transformers can surpass dedicated Transformer models trained with supervised learning. Our approach also advances state-of-the-art accuracies by 1.5%-2.3% in few-shot classification. Furthermore, our work suggests the feasibility of applying unified architectures from language and images to point clouds. Code is available at https://github.com/Pang-Yatian/Point-MAE.
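
The patch-and-mask step described in the abstract can be illustrated with a small sketch. The abstract does not say how the irregular patches are formed, so the farthest-point-sampling centers and nearest-neighbour grouping below are assumptions, as are the mask ratio of 0.6, the patch sizes, and every function name. The Transformer autoencoder and the shifting mask tokens operation are not modelled here.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_sampling(points, num_centers, rng):
    """Pick indices spread across the cloud (assumed way to choose patch centers)."""
    first = rng.randrange(len(points))
    chosen = [first]
    # Distance from every point to its nearest chosen center so far.
    min_d = [sq_dist(p, points[first]) for p in points]
    while len(chosen) < num_centers:
        nxt = max(range(len(points)), key=lambda i: min_d[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_d[i] = min(min_d[i], sq_dist(p, points[nxt]))
    return chosen

def make_patches(points, num_patches, patch_size, rng):
    """Group each center with its nearest neighbours; patches may overlap."""
    centers = farthest_point_sampling(points, num_patches, rng)
    patches = []
    for c in centers:
        by_distance = sorted(range(len(points)),
                             key=lambda i: sq_dist(points[i], points[c]))
        patches.append(by_distance[:patch_size])
    return centers, patches

def random_mask(num_patches, mask_ratio, rng):
    """Split patch indices into (masked, visible) at the given ratio."""
    num_masked = int(num_patches * mask_ratio)
    order = list(range(num_patches))
    rng.shuffle(order)
    return sorted(order[:num_masked]), sorted(order[num_masked:])

# Toy input: 256 random points in the unit cube (illustrative data only).
rng = random.Random(0)
cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(256)]
centers, patches = make_patches(cloud, num_patches=16, patch_size=8, rng=rng)
masked, visible = random_mask(len(patches), mask_ratio=0.6, rng=rng)
print(len(masked), "masked patches,", len(visible), "visible patches")
```

In a setup like this, only the visible patches would be passed to the encoder, and the masked patches would serve as reconstruction targets.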


Pages: 604-621

Page count: 18

