Memory Attention Networks for Skeleton-Based Action Recognition

Cited by: 88
Authors
Li, Ce [1, 2, 3]
Xie, Chunyu [2, 3]
Zhang, Baochang [2, 3]
Han, Jungong [4]
Zhen, Xiantong [5, 6]
Chen, Jie [7, 8]
Affiliations
[1] China Univ Min & Technol, Beijing 100083, Peoples R China
[2] Beihang Univ, Sch Automat Sci & Elect Engn, Beijing 100191, Peoples R China
[3] Shenzhen Acad Aerosp Technol, Shenzhen 518057, Peoples R China
[4] Aberystwyth Univ, Dept Comp Sci, Aberystwyth SY23 3FL, Dyfed, Wales
[5] Univ Amsterdam, NL-1012 WX Amsterdam, Netherlands
[6] Incept Inst Artificial Intelligence, Abu Dhabi, U Arab Emirates
[7] Peking Univ, Sch Elect & Comp Engn, Beijing 100871, Peoples R China
[8] Peng Cheng Lab, Shenzhen 518055, Peoples R China
Keywords
Skeleton; Spatiotemporal phenomena; Convolution; Feature extraction; Computer architecture; Collaboration; Learning systems; Collaborative memory fusion module (CMFM); memory attention networks (MANs); skeleton-based action recognition; spatiotemporal convolution module (STCM); temporal attention recalibration module
DOI
10.1109/TNNLS.2021.3061115
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Skeleton-based action recognition has been extensively studied, but it remains an unsolved problem because of the complex variations of skeleton joints in 3-D spatiotemporal space. To handle this issue, we propose a new temporal-then-spatial recalibration method named memory attention networks (MANs) and deploy MANs using the temporal attention recalibration module (TARM) and the spatiotemporal convolution module (STCM). In the TARM, a novel temporal attention mechanism is built based on residual learning to recalibrate frames of skeleton data temporally. In the STCM, the recalibrated sequence is transformed or encoded as the input of CNNs to further model the spatiotemporal information of the skeleton sequence. Based on MANs, a new collaborative memory fusion module (CMFM) is proposed to further improve the efficiency, leading to the collaborative MANs (C-MANs), trained with two streams of base MANs. TARM, STCM, and CMFM form a single network seamlessly and enable the whole network to be trained in an end-to-end fashion. Compared with the state-of-the-art methods, MANs and C-MANs improve the performance significantly and achieve the best results on six data sets for action recognition. The source code has been made publicly available at https://github.com/memory-attention-networks.
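The abstract describes the TARM only at a high level: frames are reweighted by a temporal attention built on residual learning. As an illustration of that general idea, and not the authors' implementation, the following minimal sketch scores each frame, normalizes the scores with a softmax over time, and adds the attention-weighted frame back onto the original frame. The function name `temporal_recalibrate`, the linear scoring vector `weights`, and the toy data are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_recalibrate(frames, weights):
    """Hypothetical residual temporal attention (illustrative only).

    frames:  list of T frames, each a list of D joint features.
    weights: list of D scoring weights (an assumed linear scorer).
    Returns the recalibrated frames and the per-frame attention.
    """
    # One scalar score per frame from the assumed linear scorer.
    scores = [sum(w * x for w, x in zip(weights, frame)) for frame in frames]
    # Normalize across the temporal axis.
    attention = softmax(scores)
    # Residual form: output = frame + attention * frame.
    recalibrated = [
        [(1.0 + a) * x for x in frame] for a, frame in zip(attention, frames)
    ]
    return recalibrated, attention


# Toy sequence: 3 frames, 2 features per frame (made-up values).
frames = [[0.0, 0.0], [1.0, 2.0], [0.5, 0.5]]
weights = [1.0, 1.0]
recalibrated, attention = temporal_recalibrate(frames, weights)
```

Because of the residual term, every frame keeps at least its original magnitude, and frames with higher scores are amplified more. In the paper's pipeline the recalibrated sequence would then be passed to the CNN-based STCM, which this sketch does not attempt to model.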
Pages: 4800-4814
Number of pages: 15
Related papers
50 in total
  • [1] Memory Attention Networks for Skeleton-based Action Recognition
    Xie, Chunyu
    Li, Ce
    Zhang, Baochang
    Chen, Chen
    Han, Jungong
    Liu, Jianzhuang
    [J]. PROCEEDINGS OF THE TWENTY-SEVENTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2018, : 1639 - 1645
  • [2] Adaptive Attention Memory Graph Convolutional Networks for Skeleton-Based Action Recognition
    Liu, Di
    Xu, Hui
    Wang, Jianzhong
    Lu, Yinghua
    Kong, Jun
    Qi, Miao
    [J]. SENSORS, 2021, 21 (20)
  • [3] Multi-Term Attention Networks for Skeleton-Based Action Recognition
    Diao, Xiaolei
    Li, Xiaoqiang
    Huang, Chen
    [J]. APPLIED SCIENCES-BASEL, 2020, 10 (15)
  • [4] Insight on Attention Modules for Skeleton-Based Action Recognition
    Jiang, Quanyan
    Wu, Xiaojun
    Kittler, Josef
    [J]. PATTERN RECOGNITION AND COMPUTER VISION, PT I, 2021, 13019 : 242 - 255
  • [5] Spatial-temporal graph attention networks for skeleton-based action recognition
    Huang, Qingqing
    Zhou, Fengyu
    He, Jiakai
    Zhao, Yang
    Qin, Runze
    [J]. JOURNAL OF ELECTRONIC IMAGING, 2020, 29 (05)
  • [6] View transform graph attention recurrent networks for skeleton-based action recognition
    Huang, Qingqing
    Zhou, Fengyu
    Qin, Runze
    Zhao, Yang
    [J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2021, 15 (03) : 599 - 606
  • [7] View transform graph attention recurrent networks for skeleton-based action recognition
    Qingqing Huang
    Fengyu Zhou
    Runze Qin
    Yang Zhao
    [J]. Signal, Image and Video Processing, 2021, 15 : 599 - 606
  • [8] Attention adjacency matrix based graph convolutional networks for skeleton-based action recognition
    Xie, Jun
    Miao, Qiguang
    Liu, Ruyi
    Xin, Wentian
    Tang, Lei
    Zhong, Sheng
    Gao, Xuesong
    [J]. NEUROCOMPUTING, 2021, 440 : 230 - 239
  • [9] Focus on temporal graph convolutional networks with unified attention for skeleton-based action recognition
    Gao, Bing-Kun
    Dong, Le
    Bi, Hong-Bo
    Bi, Yun-Ze
    [J]. APPLIED INTELLIGENCE, 2022, 52 (05) : 5608 - 5616
  • [10] SKELETON-BASED ACTION RECOGNITION WITH CONVOLUTIONAL NEURAL NETWORKS
    Li, Chao
    Zhong, Qiaoyong
    Xie, Di
    Pu, Shiliang
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2017