Unsupervised Point Cloud Representation Learning by Clustering and Neural Rendering

Cited by: 0
Authors
Mei, Guofeng [1, 3]
Saltori, Cristiano [2 ]
Ricci, Elisa [2, 3]
Sebe, Nicu [2 ]
Wu, Qiang [1 ]
Zhang, Jian [1 ]
Poiesi, Fabio [3 ]
Affiliations
[1] Univ Technol Sydney, Fac Engn & IT, Ultimo, NSW 2007, Australia
[2] Univ Trento, Dept Informat Engn & Comp Sci DISI, Via Sommar 9, I-38123 Trento, Italy
[3] Fdn Bruno Kessler, Via Sommar 18, I-38123 Trento, Italy
Keywords
Unsupervised learning; Point cloud; Data-augmentation; Clustering; Neural rendering;
DOI
10.1007/s11263-024-02027-5
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data augmentation has contributed to the rapid advancement of unsupervised learning on 3D point clouds. However, we argue that data augmentation is not ideal, as it requires a careful application-dependent selection of the types of augmentations to be performed, thus potentially biasing the information learned by the network during self-training. Moreover, several unsupervised methods only focus on uni-modal information, thus potentially introducing challenges in the case of sparse and textureless point clouds. To address these issues, we propose an augmentation-free unsupervised approach for point clouds, named CluRender, to learn transferable point-level features by leveraging uni-modal information for soft clustering and cross-modal information for neural rendering. Soft clustering enables self-training through a pseudo-label prediction task, where the affiliation of points to their clusters is used as a proxy under the constraint that these pseudo-labels divide the point cloud into approximately equal partitions. This allows us to formulate a clustering loss that minimizes the standard cross-entropy between pseudo and predicted labels. Neural rendering generates photorealistic renderings from various viewpoints to transfer photometric cues from 2D images to the features. The consistency between rendered and real images is then measured to form a fitting loss, which is combined with the cross-entropy loss to self-train networks. Experiments on downstream applications, including 3D object detection, semantic segmentation, classification, part segmentation, and few-shot learning, demonstrate the effectiveness of our framework in outperforming state-of-the-art techniques.
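The clustering loss described in the abstract can be illustrated with a small sketch. The abstract does not say how the equal-partition constraint is enforced, so the example below assumes a Sinkhorn-style alternating normalization as a stand-in; the function names `balanced_pseudo_labels` and `clustering_loss` and the toy scores are hypothetical and are not taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over one point's cluster scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    z = sum(exps)
    return [v / z for v in exps]


def balanced_pseudo_labels(scores, n_iters=50):
    """Soft pseudo-labels with roughly equal cluster mass.

    ASSUMPTION: a Sinkhorn-style alternating normalization stands in for
    the paper's equal-partition constraint, which the abstract does not detail.
    """
    n, k = len(scores), len(scores[0])
    q = [softmax(row) for row in scores]  # strictly positive starting point
    for _ in range(n_iters):
        # Column step: rescale so every cluster holds total mass n / k.
        col = [sum(q[i][j] for i in range(n)) for j in range(k)]
        q = [[q[i][j] * (n / k) / col[j] for j in range(k)] for i in range(n)]
        # Row step: each point's memberships sum to 1.
        q = [[v / sum(row) for v in row] for row in q]
    return q


def clustering_loss(scores, pseudo):
    """Mean cross-entropy between pseudo-labels and softmax predictions."""
    total = 0.0
    for row, q in zip(scores, pseudo):
        p = softmax(row)
        total -= sum(qj * math.log(pj) for qj, pj in zip(q, p))
    return total / len(scores)


# Toy per-point cluster scores: 4 points, 2 clusters (made-up values).
scores = [[2.0, 0.1], [1.5, 0.3], [1.8, 0.2], [0.2, 1.0]]
pseudo = balanced_pseudo_labels(scores)
loss = clustering_loss(scores, pseudo)
```

In this sketch three of the four points initially favor the first cluster; the balancing step shifts mass so that each cluster ends up with about half of the total membership before the cross-entropy is computed.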
Pages: 3251-3269
Number of pages: 19
Related papers
50 in total
  • [21] MPR-GAN: A Novel Neural Rendering Framework for MLS Point Cloud With Deep Generative Learning
    Xu, Qingyang
    Guan, Xuefeng
    Cao, Jun
    Ma, Yanli
    Wu, Huayi
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [22] Unsupervised Representation Learning via Neural Activation Coding
    Park, Yookoon
    Lee, Sangho
    Kim, Gunhee
    Blei, David M.
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [23] Ponder: Point Cloud Pre-training via Neural Rendering
    Huang, Di
    Peng, Sida
    He, Tong
    Yang, Honghui
    Zhou, Xiaowei
    Ouyang, Wanli
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 16043 - 16052
  • [24] Point Cloud Rendering in FPGA
    Zemcik, Pavel
    Marsik, Lukas
    Herout, Adam
    WSCG 2009, COMMUNICATION PAPERS PROCEEDINGS, 2009, : 63 - 66
  • [25] Unsupervised Neural Rendering for Image Hazing
    Li, Boyun
    Lin, Yijie
    Bai, Jinfeng
    Hu, Peng
    Lv, Jiancheng
    Peng, Xi
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 3987 - 3996
  • [26] Neural Points: Point Cloud Representation with Neural Fields for Arbitrary Upsampling
    Feng, Wanquan
    Li, Jin
    Cai, Hongrui
    Luo, Xiaonan
    Zhang, Juyong
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 18612 - 18621
  • [27] Learning a Structured Latent Space for Unsupervised Point Cloud Completion
    Cai, Yingjie
    Lin, Kwan-Yee
    Zhang, Chao
    Wang, Qiang
    Wang, Xiaogang
    Li, Hongsheng
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 5533 - 5543
  • [28] RotPredictor: Unsupervised Canonical Viewpoint Learning for Point Cloud Classification
    Fang, Jin
    Zhou, Dingfu
    Song, Xibin
    Jin, Shengze
    Yang, Ruigang
    Zhang, Liangjun
    2020 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2020), 2020, : 987 - 996
  • [29] MEJIGCLU: MORE EFFECTIVE JIGSAW CLUSTERING FOR UNSUPERVISED VISUAL REPRESENTATION LEARNING
    Zhang, Yongsheng
    Liu, Qing
    Zhao, Yang
    Liang, Yixiong
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2135 - 2139
  • [30] Robust multilayer bootstrap networks in ensemble for unsupervised representation learning and clustering
    Zhang, Xiao-Lei
    Li, Xuelong
    PATTERN RECOGNITION, 2024, 156