Action Recognition by an Attention-Aware Temporal Weighted Convolutional Neural Network

Cited by: 28
Authors
Wang, Le [1]
Zang, Jinliang [1]
Zhang, Qilin [2]
Niu, Zhenxing [3]
Hua, Gang [4]
Zheng, Nanning [1]
Affiliations
[1] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian 710049, Shaanxi, Peoples R China
[2] HERE Technol, Chicago, IL 60606 USA
[3] Alibaba Grp, Hangzhou 311121, Zhejiang, Peoples R China
[4] Microsoft Res, Redmond, WA 98052 USA
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
action recognition; attention model; convolutional neural networks; video-level prediction; temporal weighting;
DOI
10.3390/s18071979
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry];
Discipline classification codes
070302 ; 081704 ;
Abstract
Research in human action recognition has accelerated significantly since the introduction of powerful machine learning tools such as Convolutional Neural Networks (CNNs). However, effective and efficient methods for incorporating temporal information into CNNs are still being actively explored. Motivated by the popular recurrent attention models in natural language processing, we propose the Attention-aware Temporal Weighted CNN (ATW CNN) for action recognition in videos, which embeds a visual attention model into a temporal weighted multi-stream CNN. This attention model is implemented simply as temporal weighting, yet it effectively boosts the recognition performance of video representations. In addition, each stream in the proposed ATW CNN framework is capable of end-to-end training, with both network parameters and temporal weights optimized by stochastic gradient descent (SGD) with back-propagation. Our experimental results on the UCF-101 and HMDB-51 datasets show that the proposed attention mechanism contributes substantially to the performance gains by focusing on the more relevant, more discriminative video snippets.
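The temporal weighting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulation, so everything here is an assumption made for illustration: each video is split into snippets, each snippet gets a vector of class scores, and one learnable logit per snippet is normalized with a softmax to produce the temporal weights. The function names `softmax` and `video_level_scores` and the sample numbers are hypothetical.

```python
import math


def softmax(logits):
    """Normalize raw logits into weights that are positive and sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def video_level_scores(snippet_scores, temporal_logits):
    """Fuse per-snippet class scores into one video-level score vector.

    snippet_scores: list of per-snippet class-score lists (one list per snippet).
    temporal_logits: one raw attention value per snippet (assumed learnable).
    """
    if len(snippet_scores) != len(temporal_logits):
        raise ValueError("need exactly one temporal logit per snippet")
    weights = softmax(temporal_logits)
    num_classes = len(snippet_scores[0])
    # Weighted sum over snippets, computed independently for each class.
    return [
        sum(w * scores[c] for w, scores in zip(weights, snippet_scores))
        for c in range(num_classes)
    ]


# Hypothetical scores for 3 snippets and 2 action classes.
snippet_scores = [[0.2, 0.8], [0.6, 0.4], [0.1, 0.9]]

# Equal logits reduce to plain averaging over snippets.
uniform = video_level_scores(snippet_scores, [0.0, 0.0, 0.0])

# A dominant logit makes the prediction follow that snippet almost entirely.
focused = video_level_scores(snippet_scores, [-50.0, 50.0, -50.0])
```

With equal logits the fusion is an ordinary average, so any gain from attention in this sketch comes only from shifting weight toward particular snippets. In an end-to-end setting the logits would be trained jointly with the network parameters by SGD, as the abstract states; that training loop is omitted here.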
Pages: 18