Unsupervised 3D Point Cloud Representation Learning by Triangle Constrained Contrast for Autonomous Driving

Cited by: 4
Authors
Pang, Bo [1 ]
Xia, Hongchi [1 ]
Lu, Cewu [1 ,2 ,3 ,4 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Shanghai Jiao Tong Univ, Qing Yuan Res Inst, Shanghai, Peoples R China
[3] Shanghai Jiao Tong Univ, MoE Key Lab Artificial Intelligence, AI Inst, Shanghai, Peoples R China
[4] Shanghai Qi Zhi Inst, Shanghai, Peoples R China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023
DOI
10.1109/CVPR52729.2023.00506
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Because annotating 3D LiDAR data for autonomous driving is difficult, an efficient unsupervised 3D representation learning method is important. In this paper, we design the Triangle Constrained Contrast (TriCC) framework, tailored for autonomous driving scenes, which learns unsupervised 3D representations from both multimodal information and the dynamics of temporal sequences. We treat one camera image and two LiDAR point clouds with different timestamps as a triplet. Our key design is a consistency constraint that automatically finds matching relationships among the triplet through a "self-cycle" and learns representations from them. With the matching relations across the temporal dimension and modalities, we can further conduct a triplet contrast to improve learning efficiency. To the best of our knowledge, TriCC is the first framework that unifies temporal and multimodal semantics, which means it utilizes almost all the information in autonomous driving scenes. Compared with previous contrastive methods, it automatically digs out more difficult contrasting pairs instead of relying on handcrafted ones. Extensive experiments are conducted with Minkowski-UNet and VoxelNet on several semantic segmentation and 3D detection datasets. Results show that TriCC learns effective representations with far fewer training iterations and greatly improves the SOTA results on all the downstream tasks. Code and models can be found at https://bopang1996.github.io/.
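The "self-cycle" idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the hard nearest-neighbor matching, the cosine similarity, and all names below (`cosine`, `nearest`, `cycle_consistent_triplets`, and the toy feature lists) are assumptions chosen only to show what a cycle-consistency check over a triplet of feature sets might look like.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest(query, candidates):
    """Index of the candidate feature most similar to the query."""
    return max(range(len(candidates)), key=lambda j: cosine(query, candidates[j]))

def cycle_consistent_triplets(pc_a, pc_b, img):
    """Keep (i, j, k) only if the walk pc_a[i] -> pc_b[j] -> img[k] -> pc_a ends at i.

    pc_a, pc_b: hypothetical per-point features of two LiDAR sweeps.
    img: hypothetical per-pixel features of one camera image.
    """
    triplets = []
    for i, feat in enumerate(pc_a):
        j = nearest(feat, pc_b)        # point cloud A -> point cloud B
        k = nearest(pc_b[j], img)      # point cloud B -> image
        back = nearest(img[k], pc_a)   # image -> point cloud A
        if back == i:                  # the cycle closes: accept the match
            triplets.append((i, j, k))
    return triplets

# Toy 2-D features (made up for illustration, not real data).
pc_a = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
pc_b = [[0.0, 1.0], [1.0, 0.05]]
img = [[1.0, 0.0], [0.1, 1.0]]

triplets = cycle_consistent_triplets(pc_a, pc_b, img)
print(triplets)  # [(0, 1, 0), (1, 0, 1)]
```

In this toy case the third point of `pc_a` is rejected because its cycle returns to index 0, not to itself. A contrastive objective would then treat accepted triplets as positives; how TriCC actually forms and weights them is described in the paper, not here.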
Pages: 5229-5239
Page count: 11