Point Cloud Pre-training with Diffusion Models

Cited by: 1
Authors
Zheng, Xiao [1]
Huang, Xiaoshui [2]
Mei, Guofeng [3]
Hou, Yuenan [2]
Lyu, Zhaoyang [2]
Dai, Bo [2]
Ouyang, Wanli [2]
Gong, Yongshun [1]
Affiliations
[1] Shandong Univ, Jinan, Peoples R China
[2] Shanghai AI Lab, Shanghai, Peoples R China
[3] Fdn Bruno Kessler, Trento, Italy
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
DOI
10.1109/CVPR52733.2024.02164
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pre-training a model and then fine-tuning it on downstream tasks has demonstrated significant success in the 2D image and NLP domains. However, due to the unordered and non-uniform density characteristics of point clouds, it is non-trivial to explore the prior knowledge of point clouds and pre-train a point cloud backbone. In this paper, we propose a novel pre-training method called Point cloud Diffusion pre-training (PointDif). We consider the point cloud pre-training task as a conditional point-to-point generation problem and introduce a conditional point generator. This generator aggregates the features extracted by the backbone and employs them as the condition to guide the point-to-point recovery from the noisy point cloud, thereby assisting the backbone in capturing both local and global geometric priors as well as the global point density distribution of the object. We also present a recurrent uniform sampling optimization strategy, which enables the model to uniformly recover from various noise levels and learn from balanced supervision. Our PointDif achieves substantial improvement across various real-world datasets for diverse downstream tasks such as classification, segmentation and detection. Specifically, PointDif attains 70.0% mIoU on S3DIS Area 5 for the segmentation task and achieves an average improvement of 2.4% on ScanObjectNN for the classification task compared to TAP. Furthermore, our pre-training framework can be flexibly applied to diverse point cloud backbones and bring considerable gains. Code is available at https://github.com/zhengxiaozx/PointDif.
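The abstract describes the core recipe: the backbone encodes the clean point cloud, its features condition a point generator that denoises a diffused copy of the same cloud, and diffusion timesteps are drawn so that all noise levels receive balanced supervision. The sketch below illustrates that recipe as a toy PyTorch pre-training loop; the module sizes, the noise schedule, and the interval-cycling sampler used to approximate the recurrent uniform sampling strategy are assumptions for illustration, not the authors' released implementation (see the GitHub link above for that).

# Minimal sketch of diffusion-based point cloud pre-training in the spirit of PointDif.
# All module names, sizes, and the interval-cycling sampler are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 1000                                        # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)           # standard linear noise schedule (assumed)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)

class PointBackbone(nn.Module):
    """Toy stand-in for the pre-trained backbone: per-point MLP + max pooling."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, feat_dim))
    def forward(self, pts):                     # pts: (B, N, 3)
        return self.mlp(pts).max(dim=1).values  # global feature: (B, feat_dim)

class ConditionalPointGenerator(nn.Module):
    """Predicts the per-point noise, conditioned on backbone features and timestep."""
    def __init__(self, feat_dim=256, t_dim=64):
        super().__init__()
        self.t_embed = nn.Sequential(nn.Linear(1, t_dim), nn.ReLU(), nn.Linear(t_dim, t_dim))
        self.net = nn.Sequential(
            nn.Linear(3 + feat_dim + t_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 3))
    def forward(self, noisy_pts, cond, t):      # noisy_pts: (B, N, 3), cond: (B, F), t: (B,)
        B, N, _ = noisy_pts.shape
        te = self.t_embed(t.float().view(B, 1) / T)                # (B, t_dim)
        ctx = torch.cat([cond, te], dim=-1).unsqueeze(1).expand(B, N, -1)
        return self.net(torch.cat([noisy_pts, ctx], dim=-1))       # predicted noise: (B, N, 3)

def sample_t_recurrent(step, batch_size, num_intervals=4):
    """Assumed reading of recurrent uniform sampling: cycle through equal sub-intervals
    of [0, T) across iterations so every noise level is visited with balanced frequency."""
    k = step % num_intervals
    lo, hi = k * T // num_intervals, (k + 1) * T // num_intervals
    return torch.randint(lo, hi, (batch_size,))

backbone, generator = PointBackbone(), ConditionalPointGenerator()
optim = torch.optim.AdamW(list(backbone.parameters()) + list(generator.parameters()), lr=1e-3)

for step in range(100):                                   # pre-training loop on random data
    x0 = torch.randn(8, 1024, 3)                          # clean point clouds (B, N, 3)
    t = sample_t_recurrent(step, x0.shape[0])
    a_bar = alpha_bars[t].view(-1, 1, 1)
    noise = torch.randn_like(x0)
    xt = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise   # forward diffusion q(x_t | x_0)
    cond = backbone(x0)                                    # condition comes from the clean cloud
    loss = F.mse_loss(generator(xt, cond, t), noise)      # noise-prediction objective
    optim.zero_grad(); loss.backward(); optim.step()

After pre-training, only the backbone would be kept and fine-tuned on downstream classification, segmentation, or detection heads; the conditional point generator is discarded.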
Pages: 22935 - 22945
Page count: 11
Related papers (50 records)
  • [1] Unsupervised Point Cloud Pre-training via Occlusion Completion
    Wang, Hanchen
    Liu, Qi
    Yue, Xiangyu
    Lasenby, Joan
    Kusner, Matt J.
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 9762 - 9772
  • [2] Pre-training with Diffusion Models for Dental Radiography Segmentation
    Rousseau, Jeremy
    Alaka, Christian
    Covili, Emma
    Mayard, Hippolyte
    Misrachi, Laura
    Au, Willy
    DEEP GENERATIVE MODELS, DGM4MICCAI 2023, 2024, 14533 : 174 - 182
  • [3] Unsupervised Point Cloud Pre-training via Contrasting and Clustering
    Mei, Guofeng
    Huang, Xiaoshui
    Liu, Juan
    Zhang, Jian
    Wu, Qiang
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 66 - 70
  • [4] Ponder: Point Cloud Pre-training via Neural Rendering
    Huang, Di
    Peng, Sida
    He, Tong
    Yang, Honghui
    Zhou, Xiaowei
    Ouyang, Wanli
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 16043 - 16052
  • [5] Point Cloud Pre-training with Natural 3D Structures
    Yamada, Ryosuke
    Kataoka, Hirokatsu
    Chiba, Naoya
    Domae, Yukiyasu
    Ogata, Tetsuya
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 21251 - 21261
  • [6] Image Priors Assisted Pre-training for Point Cloud Shape Analysis
    Li, Zhengyu
    Wu, Yao
    Qu, Yanyun
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT I, 2024, 14425 : 133 - 145
  • [7] Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models
    Wang, Ziyi
    Yu, Xumin
    Rao, Yongming
    Zhou, Jie
    Lu, Jiwen
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 5617 - 5627
  • [8] PointClustering: Unsupervised Point Cloud Pre-training using Transformation Invariance in Clustering
    Long, Fuchen
    Yao, Ting
    Qiu, Zhaofan
    Li, Lusong
    Mei, Tao
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 21824 - 21834
  • [9] On Effectiveness of Further Pre-training on BERT Models for Story Point Estimation
    Amasaki, Sousuke
    PROCEEDINGS OF THE 19TH INTERNATIONAL CONFERENCE ON PREDICTIVE MODELS AND DATA ANALYTICS IN SOFTWARE ENGINEERING, PROMISE 2023, 2023, : 49 - 53
  • [10] Multi-stage Pre-training over Simplified Multimodal Pre-training Models
    Liu, Tongtong
    Feng, Fangxiang
    Wang, Xiaojie
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 2556 - 2565