Finger Gesture Spotting from Long Sequences Based on Multi-Stream Recurrent Neural Networks

Cited by: 7
Authors
Benitez-Garcia, Gibran [1, 3]
Haris, Muhammad [1]
Tsuda, Yoshiyuki [2]
Ukita, Norimichi [1]
Affiliations
[1] Toyota Technol Inst, Nagoya, Aichi 4688511, Japan
[2] DENSO Corp, Kariya, Aichi 4488661, Japan
[3] Univ Electrocommun, Dept Informat, Chofu, Tokyo 1828585, Japan
Keywords
gesture spotting; human-computer interaction; automotive user interfaces; in-vehicle sensors; recurrent neural networks
DOI
10.3390/s20020528
Chinese Library Classification number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Gesture spotting is an essential task for recognizing finger gestures used to control in-car touchless interfaces. Automated methods for this task must detect the video segments where gestures are observed, discard natural behaviors of users' hands that may resemble target gestures, and work online. In this paper, we address these challenges with a recurrent neural architecture for online finger gesture spotting. We propose a multi-stream network that merges hand and hand-location features, which helps to discriminate target gestures from natural hand movements, since the two may not occur in the same 3D spatial location. Our multi-stream recurrent neural network (RNN) recurrently learns semantic information, which allows gestures to be spotted online in long untrimmed video sequences. To validate our method, we collected a finger gesture dataset in an in-vehicle scenario of an autonomous car, capturing 226 videos with more than 2100 continuous instances using a depth sensor. On this dataset, our gesture spotting approach outperforms state-of-the-art methods, improving recall by about 10% and precision by about 15%. Furthermore, we demonstrate that, combined with an existing gesture classifier (a 3D convolutional neural network), our proposal achieves better performance than previous hand gesture recognition methods.
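The idea in the abstract of merging a hand stream with a hand-location stream and scoring frames online can be illustrated with a minimal sketch. This is not the authors' architecture: the plain tanh recurrent cell, the class names `RecurrentStream` and `TwoStreamSpotter`, the helper `spot_segments`, the feature dimensions, the random untrained weights, and the 0.5 threshold are all assumptions made for illustration only.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained, illustration only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class RecurrentStream:
    """One plain tanh recurrent cell: h_t = tanh(W_x x_t + W_h h_{t-1})."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.w_x = make_matrix(hidden_dim, input_dim, rng)
        self.w_h = make_matrix(hidden_dim, hidden_dim, rng)
        self.h = [0.0] * hidden_dim  # hidden state carried across frames

    def step(self, x):
        a = matvec(self.w_x, x)
        b = matvec(self.w_h, self.h)
        self.h = [math.tanh(p + q) for p, q in zip(a, b)]
        return self.h


class TwoStreamSpotter:
    """Hypothetical spotter: one stream for hand features, one for hand location."""

    def __init__(self, hand_dim, loc_dim, hidden_dim=4, seed=0):
        rng = random.Random(seed)
        self.hand = RecurrentStream(hand_dim, hidden_dim, rng)
        self.loc = RecurrentStream(loc_dim, hidden_dim, rng)
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(2 * hidden_dim)]

    def step(self, hand_feat, loc_feat):
        """Consume one frame and return a gesture probability for that frame."""
        # Merge the streams by concatenating their hidden states.
        merged = self.hand.step(hand_feat) + self.loc.step(loc_feat)
        logit = sum(w * m for w, m in zip(self.w_out, merged))
        return 1.0 / (1.0 + math.exp(-logit))


def spot_segments(scores, threshold=0.5):
    """Group consecutive frames scoring >= threshold into (start, end) pairs, end inclusive."""
    segments, start = [], None
    for t, score in enumerate(scores):
        if score >= threshold and start is None:
            start = t
        elif score < threshold and start is not None:
            segments.append((start, t - 1))
            start = None
    if start is not None:
        segments.append((start, len(scores) - 1))
    return segments


if __name__ == "__main__":
    rng = random.Random(1)
    spotter = TwoStreamSpotter(hand_dim=3, loc_dim=3)
    scores = []
    for _ in range(10):  # frames arrive one at a time (online processing)
        hand = [rng.random() for _ in range(3)]
        loc = [rng.random() for _ in range(3)]
        scores.append(spotter.step(hand, loc))
    print(spot_segments(scores))
```

Because each call to `step` uses only the current frame and the stored hidden states, the sketch can run on an untrimmed sequence of any length without seeing future frames. A trained model would replace the random weights and would likely use a gated cell rather than the plain tanh cell assumed here.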
Pages: 18