Multimodal Dynamic Networks for Gesture Recognition

Cited by: 10
Authors
Wu, Di [1]
Shao, Ling [1]
Affiliations
[1] Univ Sheffield, Dept Elect & Elect Engn, Sheffield S1 3JD, S Yorkshire, England
Keywords
Gesture Recognition; Human-Computer Interaction; Multimodal Fusion; Deep Belief Networks
DOI
10.1145/2647868.2654969
CLC number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Multimodal input is common in real-world gesture recognition applications such as sign language recognition. In this paper, we propose a novel bi-modal (audio and skeleton joints) dynamic network for gesture recognition. First, state-of-the-art dynamic Deep Belief Networks are deployed to extract high-level representations of the audio and skeletal joint inputs. Then, instead of traditional late fusion, we add a further perceptron layer for cross-modality learning, which takes its input from the penultimate layer of each individual network. Finally, to account for temporal dynamics, the learned shared representations are used to estimate the emission probabilities from which action sequences are inferred. In particular, we demonstrate that multimodal feature learning extracts semantically meaningful shared representations that outperform the individual modalities, and that the early fusion scheme is more effective than traditional late fusion.
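The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, layer sizes, random weights, and transition matrix are all hypothetical, and converting network posteriors into emission scores by dividing by the state prior is the usual hybrid network/HMM convention, assumed here rather than taken from the record. The sketch concatenates the penultimate activations of two per-modality networks, applies one shared perceptron layer, and decodes a state sequence with the Viterbi algorithm.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse_penultimate(h_audio, h_skel, W, b):
    """Fuse two modalities through one shared perceptron layer.

    h_audio: (T, Da) penultimate activations of the audio net (assumed given)
    h_skel:  (T, Ds) penultimate activations of the skeleton net (assumed given)
    W, b:    hypothetical fusion-layer parameters, shapes (Da + Ds, S) and (S,)
    Returns per-frame state posteriors of shape (T, S).
    """
    h = np.concatenate([h_audio, h_skel], axis=1)
    return softmax(h @ W + b)

def viterbi(log_emis, log_trans, log_prior):
    """Most likely state path given log emission scores of shape (T, S)."""
    T, S = log_emis.shape
    delta = log_prior + log_emis[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans  # scores[i, j]: state i -> state j
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_emis[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta.argmax()
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Toy run with random stand-ins for the two networks' outputs.
rng = np.random.default_rng(0)
T, Da, Ds, S = 6, 4, 5, 3
posteriors = fuse_penultimate(
    rng.normal(size=(T, Da)),
    rng.normal(size=(T, Ds)),
    rng.normal(size=(Da + Ds, S)),
    np.zeros(S),
)
state_prior = np.full(S, 1.0 / S)  # assumed uniform state prior
# Scaled likelihood: p(x | s) is proportional to p(s | x) / p(s).
log_emis = np.log(posteriors) - np.log(state_prior)
# Hypothetical "sticky" transitions: stay with 0.8, move elsewhere with 0.1.
log_trans = np.log(np.full((S, S), 0.1) + 0.7 * np.eye(S))
print(viterbi(log_emis, log_trans, np.log(state_prior)))
```

Late fusion would instead combine the two networks' output scores after each had made its own prediction. The shared layer here sees both modalities' features before any per-modality decision, which is the contrast the abstract draws.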
Pages: 945-948
Number of pages: 4