A spatiotemporal attention-based ResC3D model for large-scale gesture recognition

Cited by: 16
Authors
Li, Yunan [1, 2]
Miao, Qiguang [1, 2]
Qi, Xiangda [1, 2]
Ma, Zhenxin [1, 2]
Ouyang, Wanli [3]
Affiliations
[1] Xidian Univ, Sch Comp Sci & Technol, Xian, Shaanxi, Peoples R China
[2] Xian Key Lab Big Data & Intelligent Vis, Xian, Shaanxi, Peoples R China
[3] Univ Sydney, Sch Elect & Informat Engn, Sydney, NSW, Australia
Funding
National Key Research and Development Program of China;
Keywords
Gesture recognition; Spatiotemporal attention mechanism; ResC3D model; BEHAVIOR DETECTION; OPTICAL-FLOW; FUSION; SCENES;
DOI
10.1007/s00138-018-0996-x
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Abnormal gesture recognition has many applications in visual surveillance, crowd behavior analysis, and sensitive video content detection. However, recognizing dynamic gestures in large-scale videos remains challenging because of gesture-irrelevant factors such as variations in illumination, movement path, and background. In this paper, we propose a spatiotemporal attention-based ResC3D model for abnormal gesture recognition with large-scale videos. One key idea is to find a compact and effective representation of the gesture in both spatial and temporal contexts. To eliminate the influence of gesture-irrelevant factors, we first employ enhancement techniques such as Retinex and a hybrid median filter to improve the quality of the RGB and depth inputs. Then, we design a spatiotemporal attention scheme to focus on the most valuable cues related to the moving parts of the gesture. On top of these representations, a ResC3D network, which leverages the advantages of both the residual network and the C3D model, is developed to extract features, together with a canonical correlation analysis-based fusion scheme for blending features from different modalities. The performance of our method is evaluated on the ChaLearn IsoGD dataset. Experiments demonstrate the effectiveness of each module of our method and show that the final accuracy reaches 68.14%, which outperforms other state-of-the-art methods, including our earlier work at the 2017 ChaLearn Looking at People Workshop at ICCV.
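The abstract mentions a canonical correlation analysis-based fusion of features from different modalities but gives no details. The following is only a minimal sketch of textbook CCA fusion in NumPy, not the authors' implementation. The function names (`cca_fit`, `cca_fuse`), the ridge term `reg`, the feature dimensions, the synthetic data, and the choice between concatenating and summing the projected features are all assumptions made for illustration.

```python
import numpy as np


def _inv_sqrt(mat):
    # Inverse square root of a symmetric positive-definite matrix.
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T


def cca_fit(x, y, n_components, reg=1e-3):
    """Fit CCA on x (n, dx) and y (n, dy).

    Returns the feature means, the two projection matrices, and the
    canonical correlations. `reg` is an assumed ridge term that keeps
    the covariance matrices invertible.
    """
    mx, my = x.mean(axis=0), y.mean(axis=0)
    xc, yc = x - mx, y - my
    n = x.shape[0]
    sxx = xc.T @ xc / (n - 1) + reg * np.eye(x.shape[1])
    syy = yc.T @ yc / (n - 1) + reg * np.eye(y.shape[1])
    sxy = xc.T @ yc / (n - 1)
    sxx_is, syy_is = _inv_sqrt(sxx), _inv_sqrt(syy)
    # Singular values of the whitened cross-covariance are the
    # canonical correlations.
    u, s, vt = np.linalg.svd(sxx_is @ sxy @ syy_is, full_matrices=False)
    wx = sxx_is @ u[:, :n_components]
    wy = syy_is @ vt.T[:, :n_components]
    return mx, my, wx, wy, s[:n_components]


def cca_fuse(x, y, params, mode="concat"):
    # Project both modalities into the shared space, then either
    # concatenate or sum the projections (both are common choices).
    mx, my, wx, wy, _ = params
    zx, zy = (x - mx) @ wx, (y - my) @ wy
    if mode == "concat":
        return np.concatenate([zx, zy], axis=1)
    return zx + zy


# Synthetic stand-ins for RGB and depth features (hypothetical sizes).
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 4))
rgb_feat = latent @ rng.normal(size=(4, 16)) + 0.1 * rng.normal(size=(500, 16))
depth_feat = latent @ rng.normal(size=(4, 12)) + 0.1 * rng.normal(size=(500, 12))

params = cca_fit(rgb_feat, depth_feat, n_components=4)
fused = cca_fuse(rgb_feat, depth_feat, params)
print(fused.shape)  # (500, 8)
```

In this sketch the projections are fitted once on training features and then reused for test features; the fused vectors would feed a downstream classifier, which is outside the scope of the example.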
Pages: 875-888
Page count: 14
Related papers
50 in total
  • [41] Improving Attention-Based Handwritten Mathematical Expression Recognition with Scale Augmentation and Drop Attention
    Li, Zhe
    Jin, Lianwen
    Lai, Songxuan
    Zhu, Yecheng
    [J]. 2020 17TH INTERNATIONAL CONFERENCE ON FRONTIERS IN HANDWRITING RECOGNITION (ICFHR 2020), 2020, : 175 - 180
  • [42] 3D Gesture-Based View Manipulator for Large Scale Entity Model Review
    Park, Hye-Jin
    Park, Jiyoung
    Kim, Myoung-Hee
    [J]. ASIASIM 2012, PT I, 2012, 323 : 524 - +
  • [43] Attention-Based Deep Ensemble Net for Large-Scale Online Taxi-Hailing Demand Prediction
    Liu, Yang
    Liu, Zhiyuan
    Lyu, Cheng
    Ye, Jieping
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2020, 21 (11) : 4798 - 4807
  • [44] Attention-based Contextual Language Model Adaptation for Speech Recognition
    Martinez, Richard Diehl
    Novotney, Scott
    Bulyko, Ivan
    Rastrow, Ariya
    Stolcke, Andreas
    Gandhe, Ankur
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 1994 - 2003
  • [45] Attention-Based Spatiotemporal-Aware Network for Fine-Grained Visual Recognition
    Ren, Yili
    Lu, Ruidong
    Yuan, Guan
    Hao, Dashuai
    Li, Hongjue
    [J]. APPLIED SCIENCES-BASEL, 2024, 14 (17):
  • [46] END-TO-END ATTENTION-BASED LARGE VOCABULARY SPEECH RECOGNITION
    Bahdanau, Dzmitry
    Chorowski, Jan
    Serdyuk, Dmitriy
    Brakel, Philemon
    Bengio, Yoshua
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 4945 - 4949
  • [47] Multimodal Dependence Attention and Large-Scale Data Based Offline Handwritten Formula Recognition
    Liu, Han-Chao
    Dong, Lan-Fang
    Zhang, Xin-Ming
    [J]. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2024, 39 (03) : 654 - 670
  • [48] One-shot gesture recognition with attention-based DTW for human-robot collaboration
    Kuang, Yiqun
    Cheng, Hong
    Zheng, Yali
    Cui, Fang
    Huang, Rui
    [J]. ASSEMBLY AUTOMATION, 2020, 40 (01) : 40 - 47
  • [49] Two Streams Recurrent Neural Networks for Large-Scale Continuous Gesture Recognition
    Chai, Xiujuan
    Liu, Zhipeng
    Yin, Fang
    Liu, Zhuang
    Chen, Xilin
    [J]. 2016 23RD INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2016, : 31 - 36
  • [50] Attention-Based Cross-Domain Gesture Recognition Using WiFi Channel State Information
    Hong, Hao
    Huang, Baoqi
    Gu, Yu
    Jia, Bing
    [J]. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2021, PT II, 2022, 13156 : 571 - 585