MMTS: Multimodal Teacher-Student learning for One-Shot Human Action Recognition

Cited by: 3
Authors
Lee, Jongwhoa [1]
Sim, Minho [1]
Choi, Ho-Jin [1]
Affiliation
[1] Korea Adv Inst Sci & Technol, Sch Comp, Daejeon, South Korea
Keywords
human action recognition; skeleton; keypoints; one-shot; metric learning; teacher-student networks; CNN;
DOI
10.1109/BigComp57234.2023.00045
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
Human action recognition (HAR) is used in many real-world applications, such as visual surveillance, video retrieval, and autonomous driving. It can draw on various modalities, such as RGB, infrared, depth, or skeleton data. Among these, we use skeleton data, which suits real-time applications because it requires less input than RGB data. Furthermore, we focus on the one-shot setting. Skeleton datasets tend to be smaller than those of other modalities, so it is hard to expect a model trained on them to generalize well enough to produce good representations for unseen data (i.e., novel classes). To address this problem, we propose a skeleton-text multimodal learning method that borrows a powerful pretrained text encoder trained on a large-scale dataset. Our method applies a teacher-student approach to the skeleton-text dataset during training and uses only the skeleton at inference. The proposed method is better suited to one-shot skeleton-based HAR than the existing multimodal learning method. Our approach outperforms state-of-the-art methods on the one-shot action recognition protocol of the NTU RGB+D 120 dataset.
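The abstract gives no implementation details, so the following is only a minimal sketch of the two ideas it names: a student (skeleton) embedding pulled toward a teacher (text) embedding during training, and one-shot classification at inference using skeleton embeddings alone. The function names, the cosine-based loss, and the nearest-reference rule are all assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def distillation_loss(student_emb, teacher_emb):
    """Hypothetical teacher-student loss (an assumption, not the paper's loss).

    It is 0 when the skeleton (student) embedding points in the same direction
    as the text (teacher) embedding, and grows as the two diverge.
    """
    return 1.0 - cosine_similarity(student_emb, teacher_emb)


def one_shot_classify(query_emb, references):
    """Assign the label of the most similar reference embedding.

    `references` maps each novel class label to the embedding of its single
    reference sample. Only skeleton embeddings are needed at this stage.
    """
    return max(
        references,
        key=lambda label: cosine_similarity(query_emb, references[label]),
    )


# Toy 2-D embeddings standing in for the output of a skeleton encoder.
references = {"wave": [1.0, 0.0], "kick": [0.0, 1.0]}
predicted = one_shot_classify([0.9, 0.1], references)
```

In this toy setup each novel class contributes exactly one reference embedding, which is what makes the protocol one-shot; the text encoder would appear only in the training loss and never in `one_shot_classify`.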
Pages: 235-242
Number of pages: 8