An attention mechanism based convolutional LSTM network for video action recognition

Cited by: 0
Authors
Hongwei Ge
Zehang Yan
Wenhao Yu
Liang Sun
Affiliation
[1] Dalian University of Technology, College of Computer Science and Technology
Source
Keywords
Attention mechanism; Convolutional LSTM; Spatial transformer; Video action recognition
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Human action recognition is an important problem in video classification and an increasingly active topic in computer vision. A central challenge is representing both the static spatial information and the dynamic temporal information in a video effectively. This paper proposes an attention-based convolutional LSTM algorithm for action recognition that improves accuracy by effectively extracting the salient regions of actions in videos. First, GoogLeNet is used to extract features from the video frames. Then, the resulting feature maps are processed by a spatial transformer network, which provides the attention. Finally, the sequential information in the features is modeled by a convolutional LSTM to classify the action in the original video. To speed up training, we apply temporal coherence analysis to reduce the redundant features extracted by GoogLeNet, at a trivial loss in accuracy. Compared with state-of-the-art algorithms for video action recognition, the method achieves competitive results on three widely used datasets: UCF-11, HMDB-51, and UCF-101. Moreover, with temporal coherence analysis, desirable results are obtained while the training time is reduced.
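The abstract does not say how the temporal coherence analysis is carried out, so the following is only a minimal sketch of the general idea of dropping redundant per-frame features. It assumes that redundancy is measured by cosine similarity to the most recently kept frame; the function names, the 0.95 threshold, and the toy feature vectors are all hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def select_keyframes(features, threshold=0.95):
    """Return the indices of the frames to keep.

    Assumption (not from the paper): a frame is redundant when its
    similarity to the most recently kept frame is at least `threshold`.
    The first frame is always kept.
    """
    kept = []
    for i, feat in enumerate(features):
        if not kept or cosine_similarity(features[kept[-1]], feat) < threshold:
            kept.append(i)
    return kept


# Toy per-frame features (hypothetical values standing in for CNN outputs).
frames = [
    [1.0, 0.0, 0.0],
    [0.99, 0.01, 0.0],  # nearly identical to frame 0, so it is dropped
    [0.0, 1.0, 0.0],    # content changes, so it is kept
    [0.0, 0.98, 0.05],  # nearly identical to frame 2, so it is dropped
    [0.0, 0.0, 1.0],    # content changes again, so it is kept
]
kept_indices = select_keyframes(frames, threshold=0.95)
print(kept_indices)  # [0, 2, 4]
```

In a pipeline like the one described, only the kept frames would be passed on to the attention and convolutional LSTM stages, which is one plausible way that fewer features could shorten training. A higher threshold keeps more frames; a lower one prunes more aggressively.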
Pages: 20533 - 20556
Page count: 23
Related papers
50 in total
  • [31] Correlational Convolutional LSTM for human action recognition
    Majd, Mahshid
    Safabakhsh, Reza
    NEUROCOMPUTING, 2020, 396 : 224 - 229
  • [32] Multipath Attention and Adaptive Gating Network for Video Action Recognition
    Haiping Zhang
    Zepeng Hu
    Dongjin Yu
    Liming Guan
    Xu Liu
    Conghao Ma
    Neural Processing Letters, 56
  • [33] Convolutional neural network based on attention mechanism and Bi-LSTM for bearing remaining life prediction
    Luo, Jiahang
    Zhang, Xu
    APPLIED INTELLIGENCE, 2022, 52 (01) : 1076 - 1091
  • [34] SDAN: Stacked Diverse Attention Network for Video Action Recognition
    Zhu, Xiaoguang
    Huang, Siran
    Fan, Wenjing
    Cheng, Yuhao
    Shao, Huaqing
    Liu, Peilin
    2021 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2021,
  • [35] Multipath Attention and Adaptive Gating Network for Video Action Recognition
    Zhang, Haiping
    Hu, Zepeng
    Yu, Dongjin
    Guan, Liming
    Liu, Xu
    Ma, Conghao
    NEURAL PROCESSING LETTERS, 2024, 56 (02)
  • [36] Optimized Convolutional Neural Network Recognition for Athletes' Pneumonia Image Based on Attention Mechanism
    Zhang, Hui
    Ma, Ruipu
    Zhao, Yingao
    Zhang, Qianqian
    Sun, Quandang
    Ma, Yuanyuan
    ENTROPY, 2022, 24 (10)
  • [37] Convolutional Attention Based Mechanism for Facial Microexpression Recognition
    Talib, Hafiz Khizer Bin
    Xu, Kaiwei
    Cao, Yanlong
    Xu, Yuan-Ping
    Xu, Zhijie
    Zaman, Muhammad
    Akhunzada, Adnan
    IEEE ACCESS, 2025, 13 : 23732 - 23747
  • [38] Tomato leaf disease recognition based on improved convolutional neural network with attention mechanism
    Ni, Jiangong
    Zhou, Zhigang
    Zhao, Yifan
    Han, Zhongzhi
    Zhao, Longgang
    PLANT PATHOLOGY, 2023, 72 (07) : 1335 - 1344
  • [39] Sensors-based Human Activity Recognition with Convolutional Neural Network and Attention Mechanism
    Zhang, Wenbo
    Zhu, Tao
    Yang, Congmin
    Xiao, Jiyi
    Ning, Huansheng
    PROCEEDINGS OF 2020 IEEE 11TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS 2020), 2020, : 158 - 162
  • [40] Script identification in natural scene image and video frames using an attention based Convolutional-LSTM network
    Bhunia, Ankan Kumar
    Konwer, Aishik
    Bhunia, Ayan Kumar
    Bhowmick, Abir
    Roy, Partha P.
    Pal, Umapada
    PATTERN RECOGNITION, 2019, 85 : 172 - 184