Residual attention unit for action recognition

Cited by: 7
Authors
Liao, Zhongke [1 ]
Hu, Haifeng [1 ]
Zhang, Junxuan [1 ]
Yin, Chang [1 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Elect & Informat Technol, Guangzhou, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Action recognition; Residual learning; Attention; Background motion;
DOI
10.1016/j.cviu.2019.102821
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D CNNs are powerful tools for action recognition that can extract spatio-temporal features directly from raw videos. However, most existing 3D CNNs do not fully account for the detrimental effects of background motion, which frequently appears in videos. Background motion is often misclassified as part of the human action, which can undermine modeling of the action's dynamic pattern. In this paper, we propose the residual attention unit (RAU) to address this problem. RAU suppresses background motion by upweighting the values associated with the foreground region in the feature maps. Specifically, RAU contains two separate submodules in parallel: spatial attention and channel-wise attention. Given an intermediate feature map, the spatial attention works in a bottom-up top-down manner to generate an attention mask, while the channel-wise attention automatically recalibrates the feature responses of all channels. Because applying the attention mechanism directly to the input features may discard discriminative information, we design a bypass that preserves the integrity of the original features via a shortcut connection between the input and output of the attention module. Notably, RAU can be embedded into 3D CNNs easily and trained end-to-end along with the network. Experimental results on UCF101 and HMDB51 demonstrate the effectiveness of RAU.
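To make the described design concrete, the following is a minimal PyTorch sketch of a residual attention unit as outlined in the abstract: parallel spatial (bottom-up top-down) and channel-wise attention branches over a 3D feature map, combined through a shortcut that preserves the original features. The module name, layer choices (pooling/upsampling depth, reduction ratio), and tensor shapes are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualAttentionUnit(nn.Module):
    """Sketch of a residual attention unit for 3D feature maps (N, C, T, H, W)."""

    def __init__(self, channels, reduction=8):
        super().__init__()
        # Spatial attention: bottom-up (downsample) / top-down (upsample) branch
        # that produces a per-location mask in [0, 1].
        self.down = nn.MaxPool3d(kernel_size=2, stride=2)
        self.conv = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.mask = nn.Conv3d(channels, 1, kernel_size=1)
        # Channel-wise attention: squeeze-and-excitation style recalibration.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),
            nn.Conv3d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Bottom-up: shrink and transform; top-down: restore the input resolution.
        s = F.relu(self.conv(self.down(x)))
        s = F.interpolate(s, size=x.shape[2:], mode='trilinear', align_corners=False)
        spatial_mask = torch.sigmoid(self.mask(s))   # (N, 1, T, H, W)
        channel_mask = self.channel(x)               # (N, C, 1, 1, 1)
        attended = x * spatial_mask * channel_mask
        # Shortcut keeps the original features so the attention cannot erase
        # discriminative information.
        return x + attended


if __name__ == "__main__":
    feat = torch.randn(2, 64, 8, 56, 56)    # toy clip features
    rau = ResidualAttentionUnit(channels=64)
    print(rau(feat).shape)                  # torch.Size([2, 64, 8, 56, 56])

A unit like this keeps the input and output shapes identical, which is what allows it to be dropped between existing layers of a 3D CNN and trained end-to-end with the rest of the network.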
Pages: 8
Related Papers
50 records in total
  • [1] Residual attention fusion network for video action recognition
    Li, Ao
    Yi, Yang
    Liang, Daan
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 98
  • [2] Facial Action Unit Recognition by Prior and Adaptive Attention
    Shao, Zhiwen
    Zhou, Yong
    Zhu, Hancheng
    Du, Wen-Liang
    Yao, Rui
    Chen, Hao
    [J]. ELECTRONICS, 2022, 11 (19)
  • [3] Learning Guided Attention Masks for Facial Action Unit Recognition
    Lakshminarayana, Nagashri
    Setlur, Srirangaraj
    Govindaraju, Venu
    [J]. 2020 15TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG 2020), 2020, : 465 - 472
  • [4] Video action recognition method based on attention residual network and LSTM
    Zhang, Yu
    Dong, Pengyue
    [J]. PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021), 2021, : 3611 - 3616
  • [5] Action recognition based on attention mechanism and depthwise separable residual module
    Li, Hui
    Hu, Wenjun
    Zang, Ying
    Zhao, Shuguang
    [J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2023, 17 (01) : 57 - 65
  • [6] Attention-enhanced gated recurrent unit for action recognition in tennis
    Gao, Meng
    Ju, Bingchun
    [J]. PEERJ COMPUTER SCIENCE, 2024, 10
  • [7] Separable 3D residual attention network for human action recognition
    Zhang, Zufan
    Peng, Yue
    Gan, Chenquan
    Abate, Andrea Francesco
    Zhu, Lianxiang
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (04) : 5435 - 5453
  • [8] Facial Action Unit Recognition Based on Self-Attention Spatiotemporal Fusion
    Liang, Chaolei
    Zou, Wei
    Hu, Danfeng
    Wang, JiaJun
    [J]. 2024 5TH INTERNATIONAL CONFERENCE ON COMPUTING, NETWORKS AND INTERNET OF THINGS, CNIOT 2024, 2024, : 600 - 605