Action Recognition Using Visual Attention with Reinforcement Learning

Cited by: 11

Authors:

Li, Hongyang [1,3]

Chen, Jun [1,2]

Hu, Ruimin [1,2]

Yu, Mei [3]

Chen, Huafeng [4]

Xu, Zengmin [1]

Affiliations:

[1] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Sch Comp Sci, Wuhan, Peoples R China

[2] Wuhan Univ, Hubei Key Lab Multimedia & Network Commun Engn, Wuhan, Peoples R China

[3] China Three Gorges Univ, Coll Comp & Informat Technol, Yichang, Peoples R China

[4] Jingchu Univ Technol, Jingmen, Peoples R China

Source:

MULTIMEDIA MODELING, MMM 2019, PT II | 2019 / Vol. 11296

Keywords:

Human action recognition; Reinforcement learning; Visual attention;

DOI:

10.1007/978-3-030-05716-9_30

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Human action recognition in videos is a challenging and significant task with a broad range of applications. A visual attention mechanism can effectively reduce noise interference by focusing on the relevant parts of an image and ignoring the irrelevant parts. We propose a deep visual attention model with reinforcement learning for this task. We use a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) units as the learning agent. The agent interacts with the video and decides both which frame to look at next and where to locate the most relevant region of the selected frame. The REINFORCE method is used to learn the agent's decision policy, and back-propagation is used to train the action classifier. The experimental results demonstrate that the model's glimpse window can focus on important clues. Our model achieves significant performance improvements on the action recognition datasets UCF101 and HMDB51.
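
The REINFORCE update mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the LSTM agent and the video input are replaced by a toy setting in which a softmax policy picks one of a few candidate glimpse locations and receives a reward of 1 only when it picks a designated informative location. The function name `train_glimpse_policy`, the parameters, the running-average baseline, and the reward definition are all assumptions made for illustration.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an index from a categorical distribution."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1


def train_glimpse_policy(num_locations=4, target=2, steps=2000, lr=0.1, seed=0):
    """Toy REINFORCE: learn which glimpse location earns a reward.

    All names and the reward rule are hypothetical; the paper's agent is
    an LSTM operating on video frames, which is not modeled here.
    """
    rng = random.Random(seed)
    logits = [0.0] * num_locations  # policy parameters
    baseline = 0.0                  # running mean reward, reduces variance
    for _ in range(steps):
        probs = softmax(logits)
        action = sample(probs, rng)
        reward = 1.0 if action == target else 0.0
        advantage = reward - baseline
        # For a softmax policy, d log pi(action) / d logit_i = 1[i == action] - pi_i
        for i in range(num_locations):
            grad_log_pi = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad_log_pi
        baseline += 0.05 * (reward - baseline)
    return softmax(logits)


if __name__ == "__main__":
    print(train_glimpse_policy())
```

In this sketch the policy is updated from sampled rewards alone, without differentiating through the sampling step, which is the reason a score-function estimator such as REINFORCE suits a hard (non-differentiable) glimpse selection.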

Pages: 365-376

Number of pages: 12
