Multimodal Dynamic Networks for Gesture Recognition

Cited by: 10
Authors
Wu, Di [1]
Shao, Ling [1]
Affiliations
[1] Univ Sheffield, Dept Elect & Elect Engn, Sheffield S1 3JD, S Yorkshire, England
Keywords
Gesture Recognition; Human-Computer Interaction; Multimodal Fusion; Deep Belief Networks
DOI
10.1145/2647868.2654969
CLC Number
TP301 [Theory and Methods]
Subject Classification Code
081202
Abstract
Multimodal input is common in real-world gesture recognition applications such as sign language recognition. In this paper, we propose a novel bi-modal (audio and skeleton joints) dynamic network for gesture recognition. First, state-of-the-art dynamic Deep Belief Networks are deployed to extract high-level audio and skeletal-joint representations. Then, instead of traditional late fusion, we adopt an additional perceptron layer for cross-modality learning, taking as input each individual network's penultimate layer. Finally, to account for temporal dynamics, the learned shared representations are used to estimate the emission probabilities from which action sequences are inferred. In particular, we demonstrate that multimodal feature learning extracts semantically meaningful shared representations that outperform the individual modalities, and that this early fusion scheme is more effective than the traditional late fusion method.
Pages: 945-948
Page count: 4
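The fusion architecture summarized in the abstract (per-modality networks, a shared cross-modality layer over their penultimate activations, and per-frame emission probabilities for temporal inference) can be illustrated with a minimal PyTorch-style sketch. This is not the authors' implementation: plain feed-forward layers stand in for the dynamic Deep Belief Networks, and all module names and dimensions (audio_dim, skeleton_dim, hidden_dim, n_states) are assumptions made purely for illustration.

# Minimal sketch of a bi-modal early-fusion network, assuming illustrative
# feature dimensions. Feed-forward blocks are placeholders for the dynamic
# Deep Belief Networks described in the abstract.
import torch
import torch.nn as nn

class BiModalFusionNet(nn.Module):
    def __init__(self, audio_dim=39, skeleton_dim=60, hidden_dim=256, n_states=20):
        super().__init__()
        # Per-modality feature extractors (placeholders for the dynamic DBNs).
        self.audio_net = nn.Sequential(
            nn.Linear(audio_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.skeleton_net = nn.Sequential(
            nn.Linear(skeleton_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Shared perceptron layer over the concatenated penultimate
        # activations: the cross-modality (early fusion) step.
        self.fusion = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Emission head: per-frame log-probabilities over hidden states,
        # usable by a temporal decoder to infer the gesture sequence.
        self.emission = nn.Linear(hidden_dim, n_states)

    def forward(self, audio, skeleton):
        shared = self.fusion(torch.cat(
            [self.audio_net(audio), self.skeleton_net(skeleton)], dim=-1))
        return torch.log_softmax(self.emission(shared), dim=-1)

# Usage on a batch of frames (dimensions are made up for illustration).
model = BiModalFusionNet()
log_emissions = model(torch.randn(8, 39), torch.randn(8, 60))
print(log_emissions.shape)  # torch.Size([8, 20])

In the paper's setup, per-frame emission estimates of this kind would feed a temporal model to decode the action sequence; the sketch stops at the emission probabilities.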
Related Papers
50 items in total
  • [31] Dynamic Gesture Recognition Based on Deep 3D Natural Networks
    Yun Tie; Xunlei Zhang; Jie Chen; Lin Qi; Jiessie Tie
    Cognitive Computation, 2023, 15: 2087-2100
  • [32] Multimodal gesture recognition via multiple hypotheses rescoring
    Pitsikalis, Vassilis; Katsamanis, Athanasios; Theodorakis, Stavros; Maragos, Petros
    Journal of Machine Learning Research, 2015, 16: 255-284
  • [33] A Multimodal System for Gesture Recognition in Interactive Music Performance
    Overholt, Dan; Thompson, John; Putnam, Lance; Bell, Bo; Kleban, Jim; Sturm, Bob; Kuchera-Morin, Joann
    Computer Music Journal, 2009, 33 (04): 69-82
  • [35] Enhancing multimodal learning through personalized gesture recognition
    Junokas, M. J.; Lindgren, R.; Kang, J.; Morphew, J. W.
    Journal of Computer Assisted Learning, 2018, 34 (04): 350-357
  • [36] MGRFormer: A Multimodal Transformer Approach for Surgical Gesture Recognition
    Feghoul, Kevin; Maia, Deise Santana; El Amrani, Mehdi; Daoudi, Mohamed; Amad, Ali
    2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition (FG 2024), 2024
  • [37] Gesture recognition based on multilevel multimodal feature fusion
    Tian, Jinrong; Cheng, Wentao; Sun, Ying; Li, Gongfa; Jiang, Du; Jiang, Guozhang; Tao, Bo; Zhao, Haoyi; Chen, Disi
    Journal of Intelligent & Fuzzy Systems, 2020, 38 (03): 2539-2550
  • [38] Dynamic Hand Gesture Recognition Framework
    Premaratne, Prashan; Yang, Shuai; Zhou, ZhengMao; Bandara, Nalin
    Intelligent Computing Methodologies, 2014, 8589: 834-845
  • [39] A Dynamic Gesture and Posture Recognition System
    Sgouropoulos, Kyriakos; Stergiopoulou, Ekaterini; Papamarkos, Nikos
    Journal of Intelligent & Robotic Systems, 2014, 76 (02): 283-296
  • [40] Dynamic Gesture Recognition for Social Robots
    Carlos Castillo, Jose; Caceres-Dominguez, David; Alonso-Martin, Fernando; Castro-Gonzalez, Alvaro; Angel Salichs, Miguel
    Social Robotics, ICSR 2017, 2017, 10652: 495-505