Representation modeling learning with multi-domain decoupling for unsupervised skeleton-based action recognition

Authors
He, Zhiquan [1, 2]
Lv, Jiantu [2]
Fang, Shizhang [2]
Affiliations
[1] Guangdong Key Lab Intelligent Informat Proc, Shenzhen, Peoples R China
[2] Shenzhen Univ, Guangdong Multimedia Informat Serv Engn Technol Re, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Unsupervised learning; Contrastive learning; Action recognition;
DOI
10.1016/j.neucom.2024.127495
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Skeleton-based action recognition is a fundamental research topic in computer vision. In recent years, the unsupervised contrastive learning paradigm has achieved great success in skeleton-based action recognition. However, previous work has often treated input skeleton sequences as a whole when performing contrast, and therefore lacks fine-grained contrastive representation learning. We therefore propose a contrastive learning method for Representation Modeling with Multi-domain Decoupling (RMMD), which extracts the most significant representations from input skeleton sequences in the temporal, spatial, and frequency domains, respectively. Specifically, in the temporal and spatial domains, we propose a multi-level spatiotemporal mining reconstruction module (STMR) that iteratively reconstructs the original input skeleton sequences to highlight spatiotemporal representations under different actions. At the same time, we introduce position encoding and a global adaptive attention matrix, which balance global and local information and effectively model the spatiotemporal dependencies between joints. In the frequency domain, we use the discrete cosine transform (DCT) to convert from the temporal domain to the frequency domain, discard part of the interference information, and use frequency self-attention (FSA) and a multi-level aggregation perceptron (MLAP) to explore the frequency-domain representation in depth. Fusing the temporal, spatial, and frequency domain representations makes our model more discriminative in representing different actions. We verify the effectiveness of the model on the NTU RGB+D and PKU-MMD datasets. Extensive experiments show that our method outperforms existing unsupervised methods and achieves significant performance improvements in downstream tasks such as action recognition and action retrieval.
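The frequency-domain step in the abstract (apply a DCT along time, then discard part of the coefficients) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say which coefficients are discarded, so zeroing the high-frequency ones is an assumption here, and the function names, the `keep` parameter, and the `(frames, joints, channels)` layout are hypothetical choices made for illustration.

```python
import numpy as np


def dct_matrix(num_frames: int) -> np.ndarray:
    """Orthonormal DCT-II basis; row k holds the k-th cosine basis vector."""
    n = np.arange(num_frames)
    k = n[:, None]
    basis = np.cos(np.pi * (2 * n[None, :] + 1) * k / (2 * num_frames))
    basis = basis * np.sqrt(2.0 / num_frames)
    basis[0] /= np.sqrt(2.0)  # rescale the DC row so the matrix is orthonormal
    return basis


def temporal_dct_lowpass(seq: np.ndarray, keep: int) -> np.ndarray:
    """Apply the DCT along the time axis and zero all but the first `keep` coefficients.

    seq has shape (frames, joints, channels). Which coefficients count as
    "interference" is an assumption: here the high-frequency ones are dropped.
    """
    basis = dct_matrix(seq.shape[0])
    # Contract the basis with the time axis: result is (frequencies, joints, channels).
    coeffs = np.tensordot(basis, seq, axes=(1, 0))
    coeffs[keep:] = 0.0
    return coeffs


def inverse_temporal_dct(coeffs: np.ndarray) -> np.ndarray:
    """Map DCT coefficients back to the time domain (transpose of an orthonormal basis)."""
    basis = dct_matrix(coeffs.shape[0])
    return np.tensordot(basis.T, coeffs, axes=(1, 0))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    skeleton = rng.standard_normal((16, 25, 3))  # 16 frames, 25 joints, xyz
    coeffs = temporal_dct_lowpass(skeleton, keep=4)
    smoothed = inverse_temporal_dct(coeffs)
    print(coeffs.shape, smoothed.shape)
```

Because the basis is orthonormal, keeping all coefficients and inverting recovers the input exactly, so `keep` directly controls how much temporal detail is removed before any later frequency-domain modeling.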
Pages: 11