Action Recognition by an Attention-Aware Temporal Weighted Convolutional Neural Network

Cited by: 28
Authors
Wang, Le [1]
Zang, Jinliang [1]
Zhang, Qilin [2]
Niu, Zhenxing [3]
Hua, Gang [4]
Zheng, Nanning [1]
Affiliations
[1] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian 710049, Shaanxi, Peoples R China
[2] HERE Technol, Chicago, IL 60606 USA
[3] Alibaba Grp, Hangzhou 311121, Zhejiang, Peoples R China
[4] Microsoft Res, Redmond, WA 98052 USA
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
action recognition; attention model; convolutional neural networks; video-level prediction; temporal weighting;
DOI
10.3390/s18071979
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
Research in human action recognition has accelerated significantly since the introduction of powerful machine learning tools such as Convolutional Neural Networks (CNNs). However, effective and efficient methods for incorporating temporal information into CNNs are still being actively explored in the recent literature. Motivated by the popular recurrent attention models in natural language processing, we propose the Attention-aware Temporal Weighted CNN (ATW CNN) for action recognition in videos, which embeds a visual attention model into a temporal weighted multi-stream CNN. This attention model is implemented simply as temporal weighting, yet it effectively boosts the recognition performance of video representations. In addition, each stream in the proposed ATW CNN framework is capable of end-to-end training, with both network parameters and temporal weights optimized by stochastic gradient descent (SGD) with back-propagation. Our experimental results on the UCF-101 and HMDB-51 datasets show that the proposed attention mechanism contributes substantially to the performance gains by focusing on the more discriminative snippets, that is, the more relevant video segments.
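The temporal weighting mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each video is split into N snippets, that a CNN stream has already produced per-class scores for every snippet, and that one learnable logit per snippet is normalized with a softmax. The abstract does not state the normalization the authors use, so the softmax, the function names, and the example numbers are all assumptions made for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_weighted_scores(snippet_scores, attention_logits):
    """Fuse per-snippet class scores into one video-level score vector.

    snippet_scores: list of N lists, each holding C class scores.
    attention_logits: list of N unnormalized temporal weights (hypothetical
        learnable parameters; during training they would be updated by SGD
        together with the network parameters).
    """
    if len(snippet_scores) != len(attention_logits):
        raise ValueError("need exactly one attention logit per snippet")
    weights = softmax(attention_logits)  # assumption: softmax normalization
    num_classes = len(snippet_scores[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, snippet_scores))
        for c in range(num_classes)
    ]


# Made-up example: three snippets, two classes; the second snippet
# receives the largest temporal weight and dominates the prediction.
scores = [[0.2, 0.8], [0.9, 0.1], [0.4, 0.6]]
video_scores = temporal_weighted_scores(scores, [0.0, 2.0, 0.0])
predicted_class = max(range(len(video_scores)), key=video_scores.__getitem__)
```

With equal logits the weights become uniform and the fusion reduces to plain averaging over snippets, so in this sketch the attention only changes the result when some snippets are weighted above others.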
Pages: 18
Related papers
50 in total
  • [31] Attention-aware temporal-spatial graph neural network with multi-sensor information fusion for fault diagnosis
    Wang, Zhe
    Wu, Zhiying
    Li, Xingqiu
    Shao, Haidong
    Han, Te
    Xie, Min
    KNOWLEDGE-BASED SYSTEMS, 2023, 278
  • [32] An Attention-Aware Model for Human Action Recognition on Tree-Based Skeleton Sequences
    Ding, Runwei
    Liu, Chang
    Liu, Hong
    SOCIAL ROBOTICS, ICSR 2018, 2018, 11357 : 569 - 579
  • [33] A Novel Semantic CT Segmentation Algorithm Using Boosted Attention-Aware Convolutional Neural Networks
    Kearney, V.
    Chan, J.
    Wang, T.
    Perry, A.
    Yom, S.
    Solberg, T.
    MEDICAL PHYSICS, 2019, 46 (06) : E370 - E370
  • [34] DACNN: Dynamic Weighted Attention with Multi-channel Convolutional Neural Network for Emotion Recognition
    Yang, Cheng-Ta
    Chen, Yi-Ling
    2020 21ST IEEE INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT (MDM 2020), 2020, : 316 - 321
  • [35] Content-Aware Attention Network for Action Recognition
    Liu, Ziyi
    Wang, Le
    Zheng, Nanning
    ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2018, 2018, 519 : 109 - 120
  • [36] A Discriminative Convolutional Neural Network with Context-aware Attention
    Zhou, Yuxiang
    Liao, Lejian
    Gao, Yang
    Huang, Heyan
    Wei, Xiaochi
    ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2020, 11 (05)
  • [37] A Temporal-Aware Relation and Attention Network for Temporal Action Localization
    Zhao, Yibo
    Zhang, Hua
    Gao, Zan
    Guan, Weili
    Nie, Jie
    Liu, Anan
    Wang, Meng
    Chen, Shengyong
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 4746 - 4760
  • [38] An Attention Enhanced Spatial-Temporal Graph Convolutional LSTM Network for Action Recognition in Karate
    Guo, Jianping
    Liu, Hong
    Li, Xi
    Xu, Dahong
    Zhang, Yihan
    APPLIED SCIENCES-BASEL, 2021, 11 (18)
  • [39] Truncated attention-aware proposal networks with multi-scale dilation for temporal action detection
    Li, Ping
    Cao, Jiachen
    Yuan, Li
    Ye, Qinghao
    Xu, Xianghua
    PATTERN RECOGNITION, 2023, 142
  • [40] Attention-aware Deep Reinforcement Learning for Video Face Recognition
    Rao, Yongming
    Lu, Jiwen
    Zhou, Jie
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 3951 - 3960