Action Recognition by an Attention-Aware Temporal Weighted Convolutional Neural Network

Cited by: 28
Authors
Wang, Le [1]
Zang, Jinliang [1]
Zhang, Qilin [2]
Niu, Zhenxing [3]
Hua, Gang [4]
Zheng, Nanning [1]
Affiliations
[1] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian 710049, Shaanxi, Peoples R China
[2] HERE Technol, Chicago, IL 60606 USA
[3] Alibaba Grp, Hangzhou 311121, Zhejiang, Peoples R China
[4] Microsoft Res, Redmond, WA 98052 USA
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation;
Keywords
action recognition; attention model; convolutional neural networks; video-level prediction; temporal weighting;
DOI
10.3390/s18071979
Chinese Library Classification number
O65 [Analytical Chemistry];
Subject classification codes
070302 ; 081704 ;
Abstract
Research in human action recognition has accelerated significantly since the introduction of powerful machine learning tools such as Convolutional Neural Networks (CNNs). However, effective and efficient methods for incorporation of temporal information into CNNs are still being actively explored in the recent literature. Motivated by the popular recurrent attention models in the research area of natural language processing, we propose the Attention-aware Temporal Weighted CNN (ATW CNN) for action recognition in videos, which embeds a visual attention model into a temporal weighted multi-stream CNN. This attention model is simply implemented as temporal weighting yet it effectively boosts the recognition performance of video representations. Besides, each stream in the proposed ATW CNN framework is capable of end-to-end training, with both network parameters and temporal weights optimized by stochastic gradient descent (SGD) with back-propagation. Our experimental results on the UCF-101 and HMDB-51 datasets showed that the proposed attention mechanism contributes substantially to the performance gains with the more discriminative snippets by focusing on more relevant video segments.
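The temporal weighting described in the abstract can be illustrated with a minimal sketch. It assumes each video is split into snippets, that a CNN stream has already produced per-class scores for each snippet, and that the attention weights are obtained by a softmax over one learnable value per snippet. The softmax normalization, the function names, and the data layout are assumptions made for illustration; the abstract does not specify them.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_weighted_scores(snippet_scores, attention_logits):
    """Fuse per-snippet class scores into one video-level score vector.

    snippet_scores: list of per-snippet score lists, all of equal length
        (hypothetical output of one CNN stream).
    attention_logits: one unnormalized weight per snippet (assumed to be
        learnable parameters; normalized here with a softmax).
    Returns (video_scores, weights).
    """
    if len(snippet_scores) != len(attention_logits):
        raise ValueError("need exactly one attention logit per snippet")
    weights = softmax(attention_logits)
    num_classes = len(snippet_scores[0])
    video_scores = [0.0] * num_classes
    # Weighted sum over snippets: more relevant snippets contribute more.
    for weight, scores in zip(weights, snippet_scores):
        for c in range(num_classes):
            video_scores[c] += weight * scores[c]
    return video_scores, weights
```

With equal logits, for example `temporal_weighted_scores([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])`, the weights are uniform and the result reduces to a plain average of the snippet scores. Unequal logits shift the video-level prediction toward the snippets with larger weights.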
Pages: 18
Related papers
50 in total
  • [21] Attention-Aware Invertible Hashing Network
    Li, Shanshan
    Cai, Qiang
    Li, Zhuangzi
    Li, Haisheng
    Zhang, Naiguang
    Cao, Jian
    IMAGE AND GRAPHICS, ICIG 2019, PT III, 2019, 11903 : 409 - 420
  • [22] Attention-aware fully convolutional neural network with convolutional long short-term memory network for ultrasound-based motion tracking
    Huang, Pu
    Yu, Gang
    Lu, Hua
    Liu, Danhua
    Xing, Ligang
    Yin, Yong
    Kovalchuk, Nataliya
    Xing, Lei
    Li, Dengwang
    MEDICAL PHYSICS, 2019, 46 (05) : 2275 - 2285
  • [23] Ultrasound-Based Motion Tracking Using Attention-Aware Convolutional Neural Network and Convolutional Long Short-Term Memory Network
    Huang, P.
    Yu, G.
    Lu, H.
    Liu, D.
    Xing, L.
    Yin, Y.
    Kovalchuk, N.
    Xing, L.
    Li, D.
    MEDICAL PHYSICS, 2019, 46 (06) : E133 - E134
  • [24] Dual attention convolutional network for action recognition
    Li, Xiaoqiang
    Xie, Miao
    Zhang, Yin
    Ding, Guangtai
    Tong, Weiqin
    IET IMAGE PROCESSING, 2020, 14 (06) : 1059 - 1065
  • [25] Neural Attention-Aware Hierarchical Topic Model
    Jin, Yuan
    Zhao, He
    Liu, Ming
    Du, Lan
    Buntine, Wray
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 1042 - 1052
  • [26] Attention-Aware Network with Latent Semantic Analysis for Clothing Invariant Gait Recognition
    Ling, Hefei
    Wu, Jia
    Li, Ping
    Shen, Jialie
    CMC-COMPUTERS MATERIALS & CONTINUA, 2019, 60 (03) : 1041 - 1054
  • [27] Attention-Aware Sobel Graph Convolutional Network for Remote Sensing Image Change Detection
    Wang, Lei
    You, Zhi-Hui
    Lu, Wei
    Chen, Si-Bao
    Tang, Jin
    Luo, Bin
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62
  • [28] Spatial-temporal pyramid based Convolutional Neural Network for action recognition
    Zheng, Zhenxing
    An, Gaoyun
    Wu, Dapeng
    Ruan, Qiuqi
    NEUROCOMPUTING, 2019, 358 : 446 - 455
  • [29] Temporal Pyramid Pooling-Based Convolutional Neural Network for Action Recognition
    Wang, Peng
    Cao, Yuanzhouhan
    Shen, Chunhua
    Liu, Lingqiao
    Shen, Heng Tao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2017, 27 (12) : 2613 - 2622
  • [30] Attention-aware concentrated network for saliency prediction
    Li, Pengqian
    Xing, Xiaofen
    Xu, Xiangmin
    Cai, Bolun
    Cheng, Jun
    NEUROCOMPUTING, 2021, 429 (429) : 199 - 214