Point Cloud Pre-training with Diffusion Models

Cited by: 1
Authors
Zheng, Xiao [1]
Huang, Xiaoshui [2]
Mei, Guofeng [3]
Hou, Yuenan [2]
Lyu, Zhaoyang [2]
Dai, Bo [2]
Ouyang, Wanli [2]
Gong, Yongshun [1]
Affiliations
[1] Shandong Univ, Jinan, Peoples R China
[2] Shanghai AI Lab, Shanghai, Peoples R China
[3] Fdn Bruno Kessler, Trento, Italy
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
DOI
10.1109/CVPR52733.2024.02164
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pre-training a model and then fine-tuning it on downstream tasks has demonstrated significant success in the 2D image and NLP domains. However, due to the unordered and non-uniform density characteristics of point clouds, it is non-trivial to explore the prior knowledge of point clouds and pre-train a point cloud backbone. In this paper, we propose a novel pre-training method called Point cloud Diffusion pre-training (PointDif). We consider the point cloud pre-training task as a conditional point-to-point generation problem and introduce a conditional point generator. This generator aggregates the features extracted by the backbone and employs them as the condition to guide the point-to-point recovery from the noisy point cloud, thereby assisting the backbone in capturing both local and global geometric priors as well as the global point density distribution of the object. We also present a recurrent uniform sampling optimization strategy, which enables the model to uniformly recover from various noise levels and learn from balanced supervision. Our PointDif achieves substantial improvement across various real-world datasets for diverse downstream tasks such as classification, segmentation and detection. Specifically, PointDif attains 70.0% mIoU on S3DIS Area 5 for the segmentation task and achieves an average improvement of 2.4% on ScanObjectNN for the classification task compared to TAP. Furthermore, our pre-training framework can be flexibly applied to diverse point cloud backbones and bring considerable gains. Code is available at https://github.com/zhengxiaozx/PointDif.
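The abstract describes the core recipe: the backbone encodes the clean point cloud, its features condition a point generator that denoises a diffused copy of the same cloud, and diffusion timesteps are drawn so that all noise levels receive balanced supervision. The sketch below illustrates that recipe as a toy PyTorch pre-training loop; the module sizes, the noise schedule, and the interval-cycling sampler used to approximate the recurrent uniform sampling strategy are assumptions for illustration, not the authors' released implementation (see the GitHub link above for that).

# Minimal sketch of diffusion-based point cloud pre-training in the spirit of PointDif.
# All module names, sizes, and the interval-cycling sampler are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 1000                                        # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)           # standard linear noise schedule (assumed)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)

class PointBackbone(nn.Module):
    """Toy stand-in for the pre-trained backbone: per-point MLP + max pooling."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, feat_dim))
    def forward(self, pts):                     # pts: (B, N, 3)
        return self.mlp(pts).max(dim=1).values  # global feature: (B, feat_dim)

class ConditionalPointGenerator(nn.Module):
    """Predicts the per-point noise, conditioned on backbone features and timestep."""
    def __init__(self, feat_dim=256, t_dim=64):
        super().__init__()
        self.t_embed = nn.Sequential(nn.Linear(1, t_dim), nn.ReLU(), nn.Linear(t_dim, t_dim))
        self.net = nn.Sequential(
            nn.Linear(3 + feat_dim + t_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 3))
    def forward(self, noisy_pts, cond, t):      # noisy_pts: (B, N, 3), cond: (B, F), t: (B,)
        B, N, _ = noisy_pts.shape
        te = self.t_embed(t.float().view(B, 1) / T)                # (B, t_dim)
        ctx = torch.cat([cond, te], dim=-1).unsqueeze(1).expand(B, N, -1)
        return self.net(torch.cat([noisy_pts, ctx], dim=-1))       # predicted noise: (B, N, 3)

def sample_t_recurrent(step, batch_size, num_intervals=4):
    """Assumed reading of recurrent uniform sampling: cycle through equal sub-intervals
    of [0, T) across iterations so every noise level is visited with balanced frequency."""
    k = step % num_intervals
    lo, hi = k * T // num_intervals, (k + 1) * T // num_intervals
    return torch.randint(lo, hi, (batch_size,))

backbone, generator = PointBackbone(), ConditionalPointGenerator()
optim = torch.optim.AdamW(list(backbone.parameters()) + list(generator.parameters()), lr=1e-3)

for step in range(100):                                   # pre-training loop on random data
    x0 = torch.randn(8, 1024, 3)                          # clean point clouds (B, N, 3)
    t = sample_t_recurrent(step, x0.shape[0])
    a_bar = alpha_bars[t].view(-1, 1, 1)
    noise = torch.randn_like(x0)
    xt = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise   # forward diffusion q(x_t | x_0)
    cond = backbone(x0)                                    # condition comes from the clean cloud
    loss = F.mse_loss(generator(xt, cond, t), noise)      # noise-prediction objective
    optim.zero_grad(); loss.backward(); optim.step()

After pre-training, only the backbone would be kept and fine-tuned on downstream classification, segmentation, or detection heads; the conditional point generator is discarded.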
Pages: 22935 - 22945
Page count: 11
Related papers (50 records)
  • [1] Unsupervised Point Cloud Pre-training via Occlusion Completion
    Wang, Hanchen
    Liu, Qi
    Yue, Xiangyu
    Lasenby, Joan
    Kusner, Matt J.
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 9762 - 9772
  • [2] Pre-training with Diffusion Models for Dental Radiography Segmentation
    Rousseau, Jeremy
    Alaka, Christian
    Covili, Emma
    Mayard, Hippolyte
    Misrachi, Laura
    Au, Willy
    DEEP GENERATIVE MODELS, DGM4MICCAI 2023, 2024, 14533 : 174 - 182
  • [3] Unsupervised Point Cloud Pre-training via Contrasting and Clustering
    Mei, Guofeng
    Huang, Xiaoshui
    Liu, Juan
    Zhang, Jian
    Wu, Qiang
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 66 - 70
  • [4] Ponder: Point Cloud Pre-training via Neural Rendering
    Huang, Di
    Peng, Sida
    He, Tong
    Yang, Honghui
    Zhou, Xiaowei
    Ouyang, Wanli
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 16043 - 16052
  • [5] Point Cloud Pre-training with Natural 3D Structures
    Yamada, Ryosuke
    Kataoka, Hirokatsu
    Chiba, Naoya
    Domae, Yukiyasu
    Ogata, Tetsuya
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 21251 - 21261
  • [6] Image Priors Assisted Pre-training for Point Cloud Shape Analysis
    Li, Zhengyu
    Wu, Yao
    Qu, Yanyun
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT I, 2024, 14425 : 133 - 145
  • [7] Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models
    Wang, Ziyi
    Yu, Xumin
    Rao, Yongming
    Zhou, Jie
    Lu, Jiwen
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 5617 - 5627
  • [8] PointClustering: Unsupervised Point Cloud Pre-training using Transformation Invariance in Clustering
    Long, Fuchen
    Yao, Ting
    Qiu, Zhaofan
    Li, Lusong
    Mei, Tao
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 21824 - 21834
  • [9] On Effectiveness of Further Pre-training on BERT Models for Story Point Estimation
    Amasaki, Sousuke
    PROCEEDINGS OF THE 19TH INTERNATIONAL CONFERENCE ON PREDICTIVE MODELS AND DATA ANALYTICS IN SOFTWARE ENGINEERING, PROMISE 2023, 2023, : 49 - 53
  • [10] Multi-stage Pre-training over Simplified Multimodal Pre-training Models
    Liu, Tongtong
    Feng, Fangxiang
    Wang, Xiaojie
    59TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS AND THE 11TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING, VOL 1 (ACL-IJCNLP 2021), 2021, : 2556 - 2565