Contrastive self-supervised representation learning without negative samples for multimodal human action recognition

Cited by: 1
Authors
Yang, Huaigang [1]
Ren, Ziliang [1,2]
Yuan, Huaqiang [1]
Xu, Zhenyu [2]
Zhou, Jun [1]
Affiliations
[1] Dongguan Univ Technol, Sch Comp Sci & Technol, Dongguan, Peoples R China
[2] Chinese Acad Sci, Shenzhen Inst Adv Technol, CAS Key Lab Human Machine Intelligence Synergy Sys, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
human action recognition; multimodal representation; feature encoder; contrastive self-supervised learning; Transformer;
DOI
10.3389/fnins.2023.1225312
Chinese Library Classification (CLC) number
Q189 [Neuroscience];
Discipline classification code
071006;
Abstract
Action recognition is an important component of human-computer interaction, and multimodal feature representation and learning methods can improve recognition performance because different modalities are interrelated and complementary. However, the lack of large-scale labeled samples severely constrains the performance of existing ConvNet-based methods. In this paper, a novel and effective multimodal feature representation and contrastive self-supervised learning framework is proposed to improve both the action recognition performance of models and their generalization across application scenarios. The proposed framework shares weights between its two branches and does not require negative samples, so it can learn useful feature representations from unlabeled multimodal data, e.g., skeleton sequences and inertial measurement unit (IMU) signals. Extensive experiments are conducted on two benchmarks, UTD-MHAD and MMAct, and the results show that the proposed framework outperforms both unimodal and multimodal baselines in action retrieval, semi-supervised learning, and zero-shot learning scenarios.
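The abstract does not state the loss function, so the following is only a minimal sketch of how a negative-free objective between two branches might look. It assumes a SimSiam-style negative cosine similarity, which is an assumption and not the paper's confirmed method. The function names, and the `p_*` (branch prediction) and `z_*` (branch embedding) vectors, are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def negative_cosine(p, z):
    """Negative cosine similarity between prediction p and target z.

    Assumption: in an autodiff framework, z would be treated as a
    constant (no gradient), as in SimSiam-style training.
    """
    p_n, z_n = l2_normalize(p), l2_normalize(z)
    return -sum(a * b for a, b in zip(p_n, z_n))


def symmetric_loss(p_skel, z_skel, p_imu, z_imu):
    """Pull each modality's prediction toward the other's embedding.

    Only positive pairs (two views of the same action sample) are
    compared, so no negative samples are needed.
    """
    return 0.5 * negative_cosine(p_skel, z_imu) + 0.5 * negative_cosine(p_imu, z_skel)
```

The loss ranges from -1 (the two modalities agree perfectly) to 1 (they point in opposite directions). With a shared-weight encoder producing the embeddings, minimizing it aligns the skeleton and IMU views of the same sample.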
Pages: 14
Related papers
50 in total
  • [31] Bayesian Contrastive Learning with Manifold Regularization for Self-Supervised Skeleton Based Action Recognition
    Lin, Lilang
    Zhang, Jiahang
    Liu, Jiaying
    2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS, 2023,
  • [32] Global and Local Contrastive Learning for Self-Supervised Skeleton-Based Action Recognition
    Hu, Jinhua
    Hou, Yonghong
    Guo, Zihui
    Gao, Jiajun
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (11) : 10578 - 10589
  • [33] Boosting Contrastive Self-Supervised Learning with False Negative Cancellation
    Huynh, Tri
    Kornblith, Simon
    Walter, Matthew R.
    Maire, Michael
    Khademi, Maryam
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 986 - 996
  • [34] Transformer-Based Self-Supervised Multimodal Representation Learning for Wearable Emotion Recognition
    Wu, Yujin
    Daoudi, Mohamed
    Amad, Ali
    IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2024, 15 (01) : 157 - 172
  • [35] Radar Signal Modulation Recognition With Self-Supervised Contrastive Learning
    Li, Shiya
    Du, Xiaolin
    Cui, Guolong
    Chen, Xiaolong
    Zheng, Jibin
    Wan, Xunyang
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2024, 21
  • [36] Self-supervised contrastive video representation learning for construction equipment activity recognition on limited dataset
    Ghelmani, Ali
    Hammad, Amin
    AUTOMATION IN CONSTRUCTION, 2023, 154
  • [37] Understanding Self-Supervised Learning Dynamics without Contrastive Pairs
    Tian, Yuandong
    Chen, Xinlei
    Ganguli, Surya
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139 : 7279 - 7289
  • [38] ContIG: Self-supervised Multimodal Contrastive Learning for Medical Imaging with Genetics
    Taleb, Aiham
    Kirchler, Matthias
    Monti, Remo
    Lippert, Christoph
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 20876 - 20889
  • [39] Multimodal Visual-Tactile Representation Learning through Self-Supervised Contrastive Pre-Training
    Dave, Vedant
    Lygerakis, Fotios
    Rueckert, Elmar
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024, : 8013 - 8020
  • [40] Self-supervised Representation Learning for Fine Grained Human Hand Action Recognition in Industrial Assembly Lines
    Sturm, Fabian
    Sathiyababu, Rahul
    Allipilli, Harshitha
    Hergenroether, Elke
    Siegel, Melanie
    ADVANCES IN VISUAL COMPUTING, ISVC 2023, PT I, 2023, 14361 : 172 - 184