A Multi Modal Approach to Gesture Recognition from Audio and Video Data

Cited by: 9
Authors
Bayer, Immanuel [1]
Silbermann, Thierry [1]
Affiliations
[1] Univ Konstanz, D-78457 Constance, Germany
Keywords
Multi-modal interaction; speech and gesture recognition; fusion
DOI
10.1145/2522848.2532592
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This paper describes our approach to the multi-modal gesture recognition challenge organized by ChaLearn in conjunction with the ICMI 2013 conference. The task was to learn a vocabulary of 20 types of Italian gestures performed by different people and to detect them in sequences. We develop an algorithm to find the gesture intervals in the audio data, extract audio features from those intervals, and train two different models. We engineer features from the skeleton data and use the gesture intervals in the training data to train a model, which we then apply to the test sequences using a sliding window. We combine the models through weighted averaging and find that combining information from the two sources in this way significantly boosts the models' performance.
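The weighted-averaging step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which quantities are averaged or which weights are used, so everything here is an assumption: the sketch averages per-class probabilities from one audio model and one skeleton model, and the names `fuse_weighted_average`, `predict_label`, and `audio_weight` are hypothetical, not taken from the paper.

```python
from typing import List, Sequence


def fuse_weighted_average(
    audio_probs: Sequence[float],
    skeleton_probs: Sequence[float],
    audio_weight: float = 0.5,
) -> List[float]:
    """Blend two per-class probability vectors with a convex weight.

    Assumption: both models score the same classes in the same order.
    The weight value is illustrative; the source does not report one.
    """
    if len(audio_probs) != len(skeleton_probs):
        raise ValueError("both models must score the same number of classes")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    w = audio_weight
    return [w * a + (1.0 - w) * s for a, s in zip(audio_probs, skeleton_probs)]


def predict_label(fused_probs: Sequence[float]) -> int:
    """Return the index of the class with the highest fused probability."""
    return max(range(len(fused_probs)), key=lambda i: fused_probs[i])


# Toy example with three classes (made-up numbers, not results from the paper).
audio = [0.6, 0.3, 0.1]
skeleton = [0.2, 0.7, 0.1]
fused = fuse_weighted_average(audio, skeleton, audio_weight=0.25)
label = predict_label(fused)
```

In this toy example the audio model alone would pick class 0, but giving the skeleton model three times the weight moves the fused decision to class 1. In practice the weight would typically be tuned on held-out validation data.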
Pages: 461-465
Page count: 5
Related papers
50 in total
  • [1] A Multi-modal Gesture Recognition System Using Audio, Video, and Skeletal Joint Data
    Nandakumar, Karthik
    Wah, Wan Kong
    Alice, Chan Siu Man
    Terence, Ng Wen Zheng
    Gang, Wang Jian
    Yun, Yau Wei
    ICMI'13: PROCEEDINGS OF THE 2013 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2013 : 475 - 482
  • [2] Multi-modal Gesture Recognition using Integrated Model of Motion, Audio and Video
    GOUTSU Yusuke
    KOBAYASHI Takaki
    OBARA Junya
    KUSAJIMA Ikuo
    TAKEICHI Kazunari
    TAKANO Wataru
    NAKAMURA Yoshihiko
    Chinese Journal of Mechanical Engineering, 2015, (04) : 657 - 665
  • [3] Multi-modal Gesture Recognition using Integrated Model of Motion, Audio and Video
    Goutsu, Yusuke
    Kobayashi, Takaki
    Obara, Junya
    Kusajima, Ikuo
    Takeichi, Kazunari
    Takano, Wataru
    Nakamura, Yoshihiko
    CHINESE JOURNAL OF MECHANICAL ENGINEERING, 2015, 28 (04) : 657 - 665
  • [4] Multi-modal gesture recognition using integrated model of motion, audio and video
    Yusuke Goutsu
    Takaki Kobayashi
    Junya Obara
    Ikuo Kusajima
    Kazunari Takeichi
    Wataru Takano
    Yoshihiko Nakamura
    Chinese Journal of Mechanical Engineering, 2015, 28 : 657 - 665
  • [5] Multi-modal Gesture Recognition using Integrated Model of Motion, Audio and Video
    GOUTSU Yusuke
    KOBAYASHI Takaki
    OBARA Junya
    KUSAJIMA Ikuo
    TAKEICHI Kazunari
    TAKANO Wataru
    NAKAMURA Yoshihiko
    Chinese Journal of Mechanical Engineering, 2015, 28 (04) : 657 - 665
  • [6] Erratum to: Multi-modal Gesture Recognition using Integrated Model of Motion, Audio and Video
    GOUTSU Yusuke
    KOBAYASHI Takaki
    OBARA Junya
    KUSAJIMA Ikuo
    TAKEICHI Kazunari
    TAKANO Wataru
    NAKAMURA Yoshihiko
    Chinese Journal of Mechanical Engineering, 2017, 30 : 1473 - 1473
  • [7] Multi-Modal Emotion Recognition Fusing Video and Audio
    Xu, Chao
    Du, Pufeng
    Feng, Zhiyong
    Meng, Zhaopeng
    Cao, Tianyi
    Dong, Caichao
    APPLIED MATHEMATICS & INFORMATION SCIENCES, 2013, 7 (02) : 455 - 462
  • [8] Multi-modal Gesture Recognition using Integrated Model of Motion, Audio and Video (vol 28, pg 657, 2015)
    Goutsu, Yusuke
    Kobayashi, Takaki
    Obara, Junya
    Kusajima, Ikuo
    Takeichi, Kazunari
    Takano, Wataru
    Nakamura, Yoshihiko
    CHINESE JOURNAL OF MECHANICAL ENGINEERING, 2017, 30 (06) : 1473 - 1473
  • [9] Non-audio-Video Gesture Recognition Systems
    Craciunescu, Razvan
    WIRELESS PERSONAL COMMUNICATIONS, 2020, 110 (02) : 815 - 827
  • [10] Non-audio–Video Gesture Recognition Systems
    Razvan Craciunescu
    Wireless Personal Communications, 2020, 110 : 815 - 827