Temporal Hierarchical Dictionary with HMM for Fast Gesture Recognition

Cited by: 0
Authors
Chen, Haoyu [1 ]
Liu, Xin [1 ]
Zhao, Guoying [1 ]
Affiliations
[1] Univ Oulu, Ctr Machine Vis & Signal Anal, Oulu, Finland
Funding
Academy of Finland; National Natural Science Foundation of China
Keywords
Hidden Markov Model; hierarchical structure; Deep Neural Network; Relative Entropy
DOI
Not available
CLC Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose a novel temporal hierarchical dictionary with a hidden Markov model (HMM) for gesture recognition. Dictionaries with spatio-temporal elements are commonly used for gesture recognition. However, existing spatio-temporal dictionary-based methods require whole pre-segmented gestures for inference and therefore struggle with non-stationary sequences. The proposed method combines an HMM with Deep Belief Networks (DBNs) to tackle both gesture segmentation and recognition by performing inference at the frame level. In addition, we investigate the redundancy in dictionaries and introduce relative entropy to measure the information richness of a dictionary. Furthermore, with a flat dictionary, the whole dictionary must be searched every time an element is inferred, so the temporal structure of gestures is not exploited. The proposed temporal hierarchical dictionary is instead organized by HMM states and limits the search range to the relevant states. Our framework has three key novel properties: (1) a temporal hierarchical structure with an HMM, which makes both the HMM transitions and Viterbi decoding more efficient; (2) a relative entropy model that compresses the dictionary with less redundancy; (3) an unsupervised hierarchical clustering algorithm that builds the hierarchical dictionary automatically. Our method is evaluated on two gesture datasets and consistently achieves state-of-the-art performance. The results indicate that dictionary redundancy has a significant impact on performance, and that it can be mitigated by a temporal hierarchy and an entropy model.
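To make property (1) concrete, below is a minimal sketch of Viterbi decoding in which the dictionary is partitioned by HMM state, so scoring a state searches only the elements filed under that state rather than the whole dictionary. The names `state_dicts` and the `score` callback are illustrative assumptions, not the paper's exact interface, and the sketch uses a uniform initial state prior.

```python
import numpy as np

def viterbi_state_limited(frames, trans, state_dicts, score):
    """Viterbi decoding with per-state dictionary lookup.

    frames:      sequence of per-frame feature vectors
    trans:       (S, S) log transition matrix of the HMM
    state_dicts: state_dicts[s] is the list of dictionary elements
                 assigned to HMM state s (the temporal hierarchy)
    score:       score(frame, element) -> log-likelihood (assumed callback)
    """
    S = trans.shape[0]
    T = len(frames)
    delta = np.full((T, S), -np.inf)   # best log-score ending in state s at frame t
    back = np.zeros((T, S), dtype=int)

    def emit(t, s):
        # Search only the elements filed under state s, not the whole dictionary.
        return max(score(frames[t], e) for e in state_dicts[s])

    for s in range(S):                 # uniform initial prior assumed
        delta[0, s] = emit(0, s)
    for t in range(1, T):
        for s in range(S):
            cand = delta[t - 1] + trans[:, s]
            back[t, s] = int(np.argmax(cand))
            delta[t, s] = cand[back[t, s]] + emit(t, s)

    # Backtrack the best state path; frame-level labels enable online segmentation.
    path = [int(np.argmax(delta[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    return path[::-1]
```

Limiting `emit` to the per-state subset is what makes both the transition step and the decoding cheaper than a search over a flat dictionary.

Property (2), the redundancy measure, can be sketched with relative entropy (KL divergence): if a candidate element's distribution is nearly identical to one already kept, it adds little information and can be dropped. The greedy pruning rule and `threshold` below are assumptions for illustration; the paper's exact entropy model may differ.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Relative entropy D(p || q) between two discrete distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def compress_dictionary(elements, threshold=0.05):
    """Greedily drop near-redundant elements.

    `elements` is a list of discrete distributions, one per dictionary
    element. An element is kept only if its relative entropy to every
    already-kept element exceeds `threshold`, i.e. it adds information.
    """
    kept = []
    for e in elements:
        if all(kl_divergence(e, k) > threshold for k in kept):
            kept.append(e)
    return kept
```

For property (3), the per-state dictionary could be built with off-the-shelf agglomerative clustering, a stand-in for the paper's unsupervised hierarchical clustering algorithm; the Ward linkage and the two-level grouping here are assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def build_hierarchical_dictionary(features, n_states):
    """Group frame-level features into a two-level dictionary.

    Agglomerative (Ward) clustering partitions the pooled features into
    `n_states` top-level groups, which then serve as the HMM states that
    organize the dictionary elements.
    """
    features = np.asarray(features)
    Z = linkage(features, method="ward")               # bottom-up merge tree
    labels = fcluster(Z, t=n_states, criterion="maxclust")
    return {s: features[labels == s] for s in np.unique(labels)}
```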
Pages: 3378-3383
Page count: 6
Related Papers (showing 10 of 50)
  • [1] Temporal Hierarchical Dictionary Guided Decoding for Online Gesture Segmentation and Recognition
    Chen, Haoyu
    Liu, Xin
    Shi, Jingang
    Zhao, Guoying
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 9689 - 9702
  • [2] Gaze Gesture Recognition with Hierarchical Temporal Memory Networks
    Rozado, David
    Rodriguez, Francisco B.
    Varona, Pablo
    ADVANCES IN COMPUTATIONAL INTELLIGENCE, IWANN 2011, PT I, 2011, 6691 : 1 - 8
  • [3] Comparing Hybrid NN-HMM and RNN for Temporal Modeling in Gesture Recognition
    Granger, Nicolas
    el Yacoubi, Mounim A.
    NEURAL INFORMATION PROCESSING (ICONIP 2017), PT II, 2017, 10635 : 147 - 156
  • [4] On HMM Static Hand Gesture Recognition
    Vieriu, Radu-Laurentiu
    Goras, Bogdan
    Goras, Liviu
    2011 10TH INTERNATIONAL SYMPOSIUM ON SIGNALS, CIRCUITS AND SYSTEMS (ISSCS), 2011,
  • [5] HMM and NN for Gesture Recognition
    Delgado-Mata, Carlos
    Lee Cosio, Blanca Miriam
    2010 IEEE ELECTRONICS, ROBOTICS AND AUTOMOTIVE MECHANICS CONFERENCE (CERMA 2010), 2010, : 56 - 61
  • [6] HMM Parameter Reduction for Practical Gesture Recognition
    Rajko, Stjepan
    Qian, Gang
    2008 8TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE & GESTURE RECOGNITION (FG 2008), VOLS 1 AND 2, 2008, : 895 - 900
  • [7] Sign Language Gesture Recognition Using HMM
    Parcheta, Zuzanna
    Martinez-Hinarejos, Carlos-D.
    PATTERN RECOGNITION AND IMAGE ANALYSIS (IBPRIA 2017), 2017, 10255 : 419 - 426
  • [8] Dynamic gesture track recognition based on HMM
    Wu, XJ
    Zhao, ZJ
    PROCEEDINGS OF 2005 IEEE INTERNATIONAL WORKSHOP ON VLSI DESIGN AND VIDEO TECHNOLOGY, 2005, : 169 - 174
  • [9] Understanding HMM training for video gesture recognition
    Liu, NJ
    Lovell, BC
    Kootsookos, PJ
    Davis, RIA
    TENCON 2004 - 2004 IEEE REGION 10 CONFERENCE, VOLS A-D, PROCEEDINGS: ANALOG AND DIGITAL TECHNIQUES IN ELECTRICAL ENGINEERING, 2004, : A567 - A570
  • [10] Developing context sensitive HMM gesture recognition
    Sage, K
    Howell, AJ
    Buxton, H
    GESTURE-BASED COMMUNICATION IN HUMAN-COMPUTER INTERACTION, 2003, 2915 : 277 - 287