Local-aware spatio-temporal attention network with multi-stage feature fusion for human action recognition

Cited by: 0

Authors:

Yaqing Hou

Hua Yu

Dongsheng Zhou

Pengfei Wang

Hongwei Ge

Jianxin Zhang

Qiang Zhang

Affiliations:

[1] Dalian University of Technology, School of Computer Science and Technology

[2] Dalian University, School of Software Engineering

[3] Dalian Minzu University, School of Computer Science and Engineering

Source:

Neural Computing and Applications | 2021 / Vol. 33

Keywords:

Spatio-temporal attention networks; Spatial transformer network; Feature fusion; Human action recognition

DOI:

Not available

Chinese Library Classification number:

Subject classification number:

Abstract:

Two-stream networks have recently made excellent progress in human action recognition. However, distinguishing similar human actions in videos remains challenging. This paper proposes a novel local-aware spatio-temporal attention network with multi-stage feature fusion based on compact bilinear pooling for human action recognition. Taking two-stream networks as the backbones, the spatial network first employs multiple spatial transformer networks in parallel to locate the discriminative regions related to human actions. Feature fusion is then performed between the local and global features to enhance the human action representation. Furthermore, the output of the spatial network and the temporal information are fused at a particular layer to learn the pixel-wise correspondences. After that, three outputs are brought together to generate the global descriptors of human actions. To verify the efficacy of the proposed approach, comparison experiments are conducted with the traditional hand-engineered IDT algorithms, classical machine learning methods (e.g., SVM) and state-of-the-art deep learning methods (e.g., spatio-temporal multiplier networks). According to the results, the approach obtains the best performance among existing works, with accuracies of 95.3% and 72.9% on UCF101 and HMDB51, respectively. The experimental results thus demonstrate the superiority and significance of the proposed architecture for human action recognition.
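
The abstract names compact bilinear pooling as the fusion operator but gives no implementation details. The following is a minimal, generic sketch of the Count Sketch plus FFT formulation of compact bilinear pooling on a single pair of feature vectors, offered only as an illustration. It is not the authors' code: the function names, the NumPy implementation, and the dimensions are all assumptions chosen for the example.

```python
import numpy as np

def make_sketch_params(in_dim, out_dim, rng):
    """Draw the random bucket indices and signs for one Count Sketch (hypothetical helper)."""
    h = rng.integers(0, out_dim, size=in_dim)   # bucket index for each input channel
    s = rng.choice([-1.0, 1.0], size=in_dim)    # random sign for each input channel
    return h, s

def count_sketch(x, h, s, out_dim):
    """Project x of shape (in_dim,) to (out_dim,) by signed accumulation into buckets."""
    out = np.zeros(out_dim)
    np.add.at(out, h, s * x)  # unbuffered add, so repeated bucket indices accumulate
    return out

def compact_bilinear(x, y, params_x, params_y, out_dim):
    """Sketch the flattened outer product of x and y in out_dim dimensions."""
    sx = count_sketch(x, *params_x, out_dim)
    sy = count_sketch(y, *params_y, out_dim)
    # Circular convolution of the two sketches, computed in the frequency domain.
    return np.fft.irfft(np.fft.rfft(sx) * np.fft.rfft(sy), n=out_dim)

# Usage: fuse an assumed 512-d "spatial" vector with an assumed 512-d "temporal" vector.
rng = np.random.default_rng(0)
params_spatial = make_sketch_params(512, 2048, rng)
params_temporal = make_sketch_params(512, 2048, rng)
spatial_feat = rng.normal(size=512)
temporal_feat = rng.normal(size=512)
fused = compact_bilinear(spatial_feat, temporal_feat,
                         params_spatial, params_temporal, 2048)
```

The fused vector has 2048 entries here, instead of the 512 x 512 entries a full outer product would need. In a convolutional network the same operation would typically be applied at every spatial location and then pooled; that step is omitted from this sketch.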


Pages: 16439-16450

Page count: 11

