Select and Focus: Action Recognition with Spatial-Temporal Attention

Citations: 0

Authors:

Chan, Wensong [1]

Tian, Zhiqiang [1]

Liu, Shuai [1]

Ren, Jing [2]

Lan, Xuguang [3]

Affiliations:

[1] Xi An Jiao Tong Univ, Sch Software Engn, Xian, Peoples R China

[2] Xian Aeronaut Univ, Xian, Peoples R China

[3] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian, Peoples R China

Source:

INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2019, PT III | 2019 / Vol. 11742

Funding:

China Postdoctoral Science Foundation; National Natural Science Foundation of China;

Keywords:

Human action recognition; Deep learning; Attention;

DOI:

10.1007/978-3-030-27535-8_41

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

With the rapid development of neural networks, human action recognition has improved greatly through the use of convolutional neural networks (CNNs) or recurrent neural networks (RNNs). In this paper, we propose a model based on weighted spatial-temporal attention for action recognition. The model selects the key parts of each video frame and the important frames of each video sequence, and then focuses its analysis on these parts and frames. The most important task of the model is therefore to find the spatially key parts and the temporally important frames for recognizing the action. The model is trained and tested on three datasets: UCF-11, UCF-101, and HMDB51. The experiments demonstrate that the model achieves satisfactory results for human action recognition.
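The abstract does not describe the architecture in detail, so the following is only a minimal sketch of the general idea of weighted spatial-temporal attention, not the authors' model. It assumes each frame is already represented as a list of region feature vectors, and it uses two hypothetical scoring vectors (`w_spatial`, `w_temporal`) that would be learned in a real system. Spatial weights pick out key regions within each frame; temporal weights pick out important frames within the sequence.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(vectors, weights):
    """Weighted sum of equal-length feature vectors."""
    dim = len(vectors[0])
    return [sum(w * v[d] for v, w in zip(vectors, weights)) for d in range(dim)]

def spatial_temporal_attention(video, w_spatial, w_temporal):
    """video: list of frames, each a list of region feature vectors.
    w_spatial, w_temporal: hypothetical scoring vectors (assumed, not from the paper).
    """
    frame_features = []
    spatial_weights = []
    for regions in video:
        # Spatial attention: weight the regions inside one frame.
        alpha = softmax([dot(w_spatial, r) for r in regions])
        spatial_weights.append(alpha)
        frame_features.append(weighted_sum(regions, alpha))
    # Temporal attention: weight the frames inside the sequence.
    beta = softmax([dot(w_temporal, f) for f in frame_features])
    video_feature = weighted_sum(frame_features, beta)
    return video_feature, spatial_weights, beta

# Toy input: 2 frames, 2 regions per frame, 2-dimensional features.
video = [[[1.0, 0.0], [0.0, 1.0]],
         [[2.0, 0.0], [0.0, 0.0]]]
feature, alpha, beta = spatial_temporal_attention(video, [1.0, 0.0], [1.0, 0.0])
```

In this toy input, the second frame has the stronger response along the scored dimension, so it receives the larger temporal weight. A classifier would then operate on `feature`; that stage is omitted here.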


Pages: 461-471

Page count: 11

