Residual attention unit for action recognition

Cited by: 7
Authors
Liao, Zhongke [1 ]
Hu, Haifeng [1 ]
Zhang, Junxuan [1 ]
Yin, Chang [1 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Elect & Informat Technol, Guangzhou, Guangdong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Action recognition; Residual learning; Attention; Background motion;
DOI
10.1016/j.cviu.2019.102821
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D CNNs are powerful tools for action recognition that can extract spatio-temporal features directly from raw videos. However, most existing 3D CNNs do not fully account for the detrimental effects of background motion, which frequently appears in videos. Background motion is often misclassified as part of the human action, which can undermine modeling of the action's dynamic pattern. In this paper, we propose the residual attention unit (RAU) to address this problem. RAU suppresses background motion by upweighting the values associated with the foreground region in the feature maps. Specifically, RAU contains two separate submodules in parallel: spatial attention and channel-wise attention. Given an intermediate feature map, the spatial attention works in a bottom-up top-down manner to generate an attention mask, while the channel-wise attention automatically recalibrates the feature responses of all channels. Because applying the attention mechanism directly to the input features may discard discriminative information, we design a bypass that preserves the integrity of the original features via a shortcut connection between the input and output of the attention module. Notably, RAU can be embedded into 3D CNNs easily and trained end-to-end along with the network. Experimental results on UCF101 and HMDB51 demonstrate the effectiveness of RAU.
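To make the described design concrete, the following is a minimal PyTorch sketch of a residual attention unit as outlined in the abstract: parallel spatial (bottom-up top-down) and channel-wise attention branches over a 3D feature map, combined through a shortcut that preserves the original features. The module name, layer choices (pooling/upsampling depth, reduction ratio), and tensor shapes are illustrative assumptions, not the authors' released implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


class ResidualAttentionUnit(nn.Module):
    """Sketch of a residual attention unit for 3D feature maps (N, C, T, H, W)."""

    def __init__(self, channels, reduction=8):
        super().__init__()
        # Spatial attention: bottom-up (downsample) / top-down (upsample) branch
        # that produces a per-location mask in [0, 1].
        self.down = nn.MaxPool3d(kernel_size=2, stride=2)
        self.conv = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.mask = nn.Conv3d(channels, 1, kernel_size=1)
        # Channel-wise attention: squeeze-and-excitation style recalibration.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool3d(1),
            nn.Conv3d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        # Bottom-up: shrink and transform; top-down: restore the input resolution.
        s = F.relu(self.conv(self.down(x)))
        s = F.interpolate(s, size=x.shape[2:], mode='trilinear', align_corners=False)
        spatial_mask = torch.sigmoid(self.mask(s))   # (N, 1, T, H, W)
        channel_mask = self.channel(x)               # (N, C, 1, 1, 1)
        attended = x * spatial_mask * channel_mask
        # Shortcut keeps the original features so the attention cannot erase
        # discriminative information.
        return x + attended


if __name__ == "__main__":
    feat = torch.randn(2, 64, 8, 56, 56)    # toy clip features
    rau = ResidualAttentionUnit(channels=64)
    print(rau(feat).shape)                  # torch.Size([2, 64, 8, 56, 56])

A unit like this keeps the input and output shapes identical, which is what allows it to be dropped between existing layers of a 3D CNN and trained end-to-end with the rest of the network.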
Pages: 8
Related Papers
50 records in total
  • [1] Residual attention fusion network for video action recognition
    Li, Ao
    Yi, Yang
    Liang, Daan
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2024, 98
  • [2] Facial Action Unit Recognition by Prior and Adaptive Attention
    Shao, Zhiwen
    Zhou, Yong
    Zhu, Hancheng
    Du, Wen-Liang
    Yao, Rui
    Chen, Hao
    [J]. ELECTRONICS, 2022, 11 (19)
  • [3] Learning Guided Attention Masks for Facial Action Unit Recognition
    Lakshminarayana, Nagashri
    Setlur, Srirangaraj
    Govindaraju, Venu
    [J]. 2020 15TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE AND GESTURE RECOGNITION (FG 2020), 2020, : 465 - 472
  • [4] Video action recognition method based on attention residual network and LSTM
    Zhang, Yu
    Dong, Pengyue
    [J]. PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021), 2021, : 3611 - 3616
  • [5] Action recognition based on attention mechanism and depthwise separable residual module
    Li, Hui
    Hu, Wenjun
    Zang, Ying
    Zhao, Shuguang
    [J]. SIGNAL IMAGE AND VIDEO PROCESSING, 2023, 17 (01) : 57 - 65
  • [6] Attention-enhanced gated recurrent unit for action recognition in tennis
    Gao, Meng
    Ju, Bingchun
    [J]. PEERJ COMPUTER SCIENCE, 2024, 10
  • [7] Separable 3D residual attention network for human action recognition
    Zhang, Zufan
    Peng, Yue
    Gan, Chenquan
    Abate, Andrea Francesco
    Zhu, Lianxiang
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (04) : 5435 - 5453
  • [8] Facial Action Unit Recognition Based on Self-Attention Spatiotemporal Fusion
    Liang, Chaolei
    Zou, Wei
    Hu, Danfeng
    Wang, JiaJun
    [J]. 2024 5TH INTERNATIONAL CONFERENCE ON COMPUTING, NETWORKS AND INTERNET OF THINGS, CNIOT 2024, 2024, : 600 - 605