Action Recognition Using Visual Attention with Reinforcement Learning

Cited by: 11
Authors
Li, Hongyang [1, 3]
Chen, Jun [1, 2]
Hu, Ruimin [1, 2]
Yu, Mei [3]
Chen, Huafeng [4]
Xu, Zengmin [1]
Affiliations
[1] Wuhan Univ, Natl Engn Res Ctr Multimedia Software, Sch Comp Sci, Wuhan, Peoples R China
[2] Wuhan Univ, Hubei Key Lab Multimedia & Network Commun Engn, Wuhan, Peoples R China
[3] China Three Gorges Univ, Coll Comp & Informat Technol, Yichang, Peoples R China
[4] Jingchu Univ Technol, Jingmen, Peoples R China
Keywords
Human action recognition; Reinforcement learning; Visual attention
DOI
10.1007/978-3-030-05716-9_30
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Human action recognition in videos is a challenging and significant task with a broad range of applications. A visual attention mechanism can effectively reduce noise interference by focusing on the relevant parts of the image and ignoring the irrelevant parts. We propose a deep visual attention model with reinforcement learning for this task. We use a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) units as the learning agent. The agent interacts with the video and decides both which frame to look at next and where the most relevant region of the selected frame is located. The REINFORCE method is used to learn the agent's decision policy, and back-propagation is used to train the action classifier. The experimental results demonstrate that the glimpse window can focus on important clues. Our model achieves a significant performance improvement on the UCF101 and HMDB51 action recognition datasets.
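The abstract names REINFORCE as the method for learning the agent's decision policy. As a rough illustration only, the sketch below shows a single REINFORCE update for a glimpse-location policy. The Gaussian policy with a fixed standard deviation, the 0/1 reward for a correct classification, the baseline, and all function names are assumptions made for this example; the abstract does not specify any of them, and this is not the authors' implementation.

```python
import numpy as np

def sample_glimpse(mu, sigma, rng):
    """Sample a 2-D glimpse location from a Gaussian policy centred at mu."""
    return mu + sigma * rng.standard_normal(mu.shape)

def reinforce_grad_mu(mu, loc, sigma, reward, baseline=0.0):
    """REINFORCE estimate of the gradient of expected reward w.r.t. mu.

    For a Gaussian policy N(mu, sigma^2 I), the gradient of log-probability
    of `loc` with respect to mu is (loc - mu) / sigma^2.
    """
    return (reward - baseline) * (loc - mu) / sigma**2

rng = np.random.default_rng(0)
mu = np.array([0.0, 0.0])   # policy mean (assumed to come from the LSTM state)
sigma = 0.1                 # fixed exploration noise (assumption)
loc = sample_glimpse(mu, sigma, rng)
reward = 1.0                # hypothetical: 1 if the clip was classified correctly
grad = reinforce_grad_mu(mu, loc, sigma, reward)
mu_new = mu + 0.01 * grad   # gradient ascent on expected reward
```

In a full model the gradient would be propagated into the recurrent network's parameters rather than applied to the mean directly, while the classifier would be trained separately with ordinary back-propagation, as the abstract states.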
Pages: 365-376
Number of pages: 12