Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds

Cited by: 67
Authors
Huang, Siyuan [1]
Xie, Yichen [2]
Zhu, Song-Chun [3, 4, 5]
Zhu, Yixin [3, 4]
Affiliations
[1] University of California, Los Angeles, Los Angeles, CA 90095, USA
[2] Shanghai Jiao Tong University, Shanghai, People's Republic of China
[3] Beijing Institute for General Artificial Intelligence, Beijing, People's Republic of China
[4] Peking University, Beijing, People's Republic of China
[5] Tsinghua University, Beijing, People's Republic of China
DOI
10.1109/ICCV48922.2021.00647
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
To date, various 3D scene understanding tasks still lack practical and generalizable pre-trained models, primarily due to the intricate nature of these tasks and the immense variations introduced by camera views, lighting, occlusions, etc. In this paper, we tackle this challenge by introducing a spatio-temporal representation learning (STRL) framework, capable of learning from unlabeled 3D point clouds in a self-supervised fashion. Inspired by how infants learn from visual data in the wild, we explore the rich spatio-temporal cues derived from the 3D data. Specifically, STRL takes two temporally correlated frames from a 3D point cloud sequence as input, transforms them with spatial data augmentation, and learns an invariant representation in a self-supervised manner. To corroborate the efficacy of STRL, we conduct extensive experiments on three types of datasets (synthetic, indoor, and outdoor). Experimental results demonstrate that, compared with supervised learning methods, the learned self-supervised representation enables various models to attain comparable or even better performance, and that the pre-trained models generalize to downstream tasks, including 3D shape classification, 3D object detection, and 3D semantic segmentation. Moreover, the spatio-temporal contextual cues embedded in 3D point clouds significantly improve the learned representations.
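The pipeline outlined in the abstract (sample two temporally correlated frames, apply spatial augmentation, learn an invariant representation) can be illustrated with a minimal NumPy sketch. The abstract does not specify the augmentations, the encoder, or the loss, so everything below is an assumption made for illustration: the function names are hypothetical, the augmentation is a random z-rotation with scaling and jitter, the encoder is a toy linear map with max pooling, and the loss is a cosine-similarity objective. It is not the authors' implementation.

```python
import numpy as np

def random_rotation_z(rng):
    """Return a 3x3 rotation matrix about the z-axis by a random angle."""
    theta = rng.uniform(0.0, 2.0 * np.pi)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def augment(points, rng):
    """Assumed spatial augmentation: z-rotation, isotropic scale, jitter."""
    rotated = points @ random_rotation_z(rng).T
    scaled = rotated * rng.uniform(0.8, 1.25)
    return scaled + rng.normal(0.0, 0.01, size=points.shape)

def sample_frame_pair(sequence, max_gap, rng):
    """Pick two frames at most max_gap steps apart in the sequence."""
    i = rng.integers(0, len(sequence))
    offset = rng.integers(-max_gap, max_gap + 1)
    j = int(np.clip(i + offset, 0, len(sequence) - 1))
    return sequence[i], sequence[j]

def encode(points, weights):
    """Toy stand-in encoder: per-point linear map, ReLU, max pooling."""
    return np.maximum(points @ weights, 0.0).max(axis=0)

def invariance_loss(z_a, z_b, eps=1e-8):
    """2 - 2 * cosine similarity; near zero when embeddings align."""
    z_a = z_a / (np.linalg.norm(z_a) + eps)
    z_b = z_b / (np.linalg.norm(z_b) + eps)
    return 2.0 - 2.0 * float(z_a @ z_b)

rng = np.random.default_rng(0)
# Synthetic stand-in for a point cloud sequence: 10 frames of 256 points.
sequence = [rng.normal(size=(256, 3)) for _ in range(10)]
weights = rng.normal(size=(3, 32))

frame_a, frame_b = sample_frame_pair(sequence, max_gap=2, rng=rng)
z_a = encode(augment(frame_a, rng), weights)
z_b = encode(augment(frame_b, rng), weights)
loss = invariance_loss(z_a, z_b)
```

In a real training loop the encoder would be a learned point cloud network and the loss would be minimized by gradient descent; the sketch only shows how the two augmented views of nearby frames feed a single invariance objective.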
Pages: 6515-6525
Page count: 11