Masked Autoencoders in 3D Point Cloud Representation Learning

Cited by: 4
Authors
Jiang, Jincen [1 ]
Lu, Xuequan [2 ]
Zhao, Lizhi [1 ]
Dazeley, Richard [3 ]
Wang, Meili [1 ]
Affiliations
[1] NorthWest A&F Univ, Coll Informat Engn, Yangling 712100, Peoples R China
[2] La Trobe Univ, Dept Comp Sci & IT, Melbourne, Vic 3000, Australia
[3] Deakin Univ, Sch Informat Technol, Geelong, Vic 3216, Australia
Keywords
Point cloud compression; Transformers; Task analysis; Feature extraction; Three-dimensional displays; Solid modeling; Decoding; Self-supervised learning; point cloud; completion; NETWORK;
DOI
10.1109/TMM.2023.3314973
CLC number
TP [automation technology; computer technology];
Subject classification code
0812;
Abstract
Transformer-based self-supervised representation learning methods learn generic features from unlabeled datasets, providing useful network initialization parameters for downstream tasks. Recently, methods based on masked autoencoders have been explored in this field. The input can be masked intuitively when its content is regular, as with word sequences and 2D pixel grids; extending the idea to 3D point clouds, however, is challenging due to their irregularity. In this article, we propose Masked Autoencoders in 3D point cloud representation learning (abbreviated as MAE3D), a novel autoencoding paradigm for self-supervised learning. We first split the input point cloud into patches and mask a portion of them, then use our Patch Embedding Module to extract features from the unmasked patches. Second, we employ patch-wise MAE3D Transformers to learn both local features of point cloud patches and high-level contextual relationships between patches, and to complete the latent representations of the masked patches. Finally, our Point Cloud Reconstruction Module with a multi-task loss completes the incomplete point cloud. We conduct self-supervised pre-training on ShapeNet55 with a point cloud completion pretext task and fine-tune the pre-trained model on ModelNet40 and ScanObjectNN (PB_T50_RS, the hardest variant). Comprehensive experiments demonstrate that the local features MAE3D extracts from point cloud patches benefit downstream classification tasks, soundly outperforming state-of-the-art methods (93.4% and 86.2% classification accuracy, respectively).
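The patch-splitting and masking step described in the abstract can be sketched in a few lines of NumPy. This is not the authors' released implementation; it is a minimal illustration assuming the common recipe for point-cloud MAE pipelines: farthest-point sampling picks patch centers, each patch is the k nearest neighbors of its center, and a random subset of patches is masked. The function names (`farthest_point_sample`, `split_and_mask`) and parameter defaults are hypothetical.

```python
import numpy as np

def farthest_point_sample(points, n_centers, seed=0):
    """Greedy farthest-point sampling: pick patch centers spread over the cloud.
    Hypothetical helper, not from the paper's codebase."""
    rng = np.random.default_rng(seed)
    n = points.shape[0]
    centers = [int(rng.integers(n))]
    dists = np.full(n, np.inf)
    for _ in range(n_centers - 1):
        # distance of every point to its nearest already-chosen center
        dists = np.minimum(dists, np.linalg.norm(points - points[centers[-1]], axis=1))
        centers.append(int(np.argmax(dists)))
    return np.array(centers)

def split_and_mask(points, n_patches=64, patch_size=32, mask_ratio=0.75, seed=0):
    """Split a point cloud (N, 3) into kNN patches around FPS centers,
    then randomly mask a mask_ratio fraction of the patches."""
    rng = np.random.default_rng(seed)
    centers = farthest_point_sample(points, n_patches, seed)
    # each patch = the patch_size nearest neighbors of its center point
    d = np.linalg.norm(points[None, :, :] - points[centers][:, None, :], axis=-1)
    patches = points[np.argsort(d, axis=1)[:, :patch_size]]  # (n_patches, patch_size, 3)
    n_masked = int(mask_ratio * n_patches)
    mask = np.zeros(n_patches, dtype=bool)
    mask[rng.choice(n_patches, n_masked, replace=False)] = True
    # visible patches go to the encoder; masked patches are reconstruction targets
    return patches[~mask], patches[mask], mask

cloud = np.random.default_rng(1).standard_normal((1024, 3)).astype(np.float32)
visible, masked, mask = split_and_mask(cloud)
print(visible.shape, masked.shape, int(mask.sum()))  # (16, 32, 3) (48, 32, 3) 48
```

With the default 75% mask ratio, only 16 of 64 patches reach the encoder; the decoder must complete the remaining 48, which is the completion pretext task the paper pre-trains on.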
Pages: 820-831
Page count: 12
Related papers
50 records in total
  • [1] Rethinking Masked Representation Learning for 3D Point Cloud Understanding
    Wang, Chuxin
    Zha, Yixin
    He, Jianfeng
    Yang, Wenfei
    Zhang, Tianzhu
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2025, 34 : 247 - 262
  • [2] Masked Structural Point Cloud Modeling to Learning 3D Representation
    Yamada, Ryosuke
    Tadokoro, Ryu
    Qiu, Yue
    Kataoka, Hirokatsu
    Satoh, Yutaka
    IEEE ACCESS, 2024, 12 : 142291 - 142305
  • [3] PatchMixing Masked Autoencoders for 3D Point Cloud Self-Supervised Learning
    Lin, Chengxing
    Xu, Wenju
    Zhu, Jian
    Nie, Yongwei
    Cai, Ruichu
    Xu, Xuemiao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 9882 - 9897
  • [4] T-MAE: Temporal Masked Autoencoders for Point Cloud Representation Learning
    Wei, Weijie
    Nejadasl, Fatemeh Karimi
    Gevers, Theo
    Oswald, Martin R.
    COMPUTER VISION - ECCV 2024, PT XI, 2025, 15069 : 178 - 195
  • [5] PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection
    Chen, Anthony
    Zhang, Kevin
    Zhang, Renrui
    Wang, Zihan
    Lu, Yuheng
    Guo, Yandong
    Zhang, Shanghang
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR, 2023, : 5291 - 5301
  • [6] Feature Visualization for 3D Point Cloud Autoencoders
    Rios, Thiago
    van Stein, Bas
    Menzel, Stefan
    Baeck, Thomas
    Sendhoff, Bernhard
    Wollstadt, Patricia
    2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020,
  • [7] Joint representation learning for text and 3D point cloud
    Huang, Rui
    Pan, Xuran
    Zheng, Henry
    Jiang, Haojun
    Xie, Zhifeng
    Wu, Cheng
    Song, Shiji
    Huang, Gao
    PATTERN RECOGNITION, 2024, 147
  • [8] Geometric Invariant Representation Learning for 3D Point Cloud
    Li, Zongmin
    Zhang, Yupeng
    Bai, Yun
    2021 IEEE 33RD INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2021), 2021, : 1480 - 1485
  • [9] Masked Autoencoders for Point Cloud Self-supervised Learning
    Pang, Yatian
    Wang, Wenxiao
    Tay, Francis E. H.
    Liu, Wei
    Tian, Yonghong
    Yuan, Li
    COMPUTER VISION - ECCV 2022, PT II, 2022, 13662 : 604 - 621
  • [10] Scalability of Learning Tasks on 3D CAE Models Using Point Cloud Autoencoders
    Rios, Thiago
    Wollstadt, Patricia
    van Stein, Bas
    Baeck, Thomas
    Xu, Zhao
    Sendhoff, Bernhard
    Menzel, Stefan
    2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019, : 1367 - 1374