Mutual Information Driven Equivariant Contrastive Learning for 3D Action Representation Learning

Cited by: 0

Authors:

Lin, Lilang [1]

Zhang, Jiahang [1]

Liu, Jiaying [1]

Affiliations:

[1] Peking Univ, Wangxuan Inst Comp Technol, Beijing 100080, Peoples R China

Source:

IEEE TRANSACTIONS ON IMAGE PROCESSING | 2024 / Vol. 33

Funding:

National Natural Science Foundation of China;

Keywords:

Self-supervised learning; Skeleton; Task analysis; Representation learning; Data models; Three-dimensional displays; Convolutional neural networks; skeleton-based action recognition; contrastive learning; LSTM;

DOI:

10.1109/TIP.2024.3372451

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Self-supervised contrastive learning has proven to be successful for skeleton-based action recognition. For contrastive learning, data transformations are found to fundamentally affect the learned representation quality. However, traditional invariant contrastive learning is detrimental to the performance on the downstream task if the transformation carries important information for the task. In this sense, it limits the application of many data transformations in the current contrastive learning pipeline. To address these issues, we propose to utilize equivariant contrastive learning, which extends invariant contrastive learning and preserves important information. By integrating equivariant and invariant contrastive learning into a hybrid approach, the model can better leverage the motion patterns exposed by data transformations and obtain a more discriminative representation space. Specifically, a self-distillation loss is first proposed for transformed data of different intensities to fully utilize invariant transformations, especially strong invariant transformations. For equivariant transformations, we explore the potential of skeleton mixing and temporal shuffling for equivariant contrastive learning. Meanwhile, we analyze the impacts of different data transformations on the feature space in terms of two novel metrics proposed in this paper, namely, consistency and diversity. In particular, we demonstrate that equivariant learning boosts performance by alleviating the dimensional collapse problem. Experimental results on several benchmarks indicate that our method outperforms existing state-of-the-art methods.

Pages: 1883-1897

Page count: 15

