Residual attention fusion network for video action recognition

Cited: 0
Authors
Li, Ao [1 ]
Yi, Yang [1 ,2 ,3 ]
Liang, Daan [1 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 510006, Peoples R China
[2] Sun Yat Sen Univ, Xinhua Coll, Guangzhou 510520, Peoples R China
[3] Guangdong Key Lab Big Data Anal & Proc, Guangzhou 510275, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Action recognition; Temporal modeling; Channel-wise attention; Pixel-wise attention; HISTOGRAMS; LSTM;
DOI
10.1016/j.jvcir.2023.103987
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Human action recognition in videos is a fundamental and important topic in computer vision, and modeling the spatial-temporal dynamics of a video is crucial for action classification. In this paper, a novel attention module named the Channel-wise Non-local Attention Module (CNAM) is proposed to highlight important features both spatially and temporally. In addition, another new attention module, the Channel-wise Attention Recalibration Module (CARM), is developed to capture discriminative features at the channel level. Based on these two attention modules, a novel convolutional neural network named the Residual Attention Fusion Network (RAFN) is proposed to model long-range temporal structure while capturing more discriminative action features. More specifically, first, a sparse temporal sampling strategy is adopted to uniformly sample video frames along the temporal dimension as input to RAFN. Second, the CNAM and CARM attention modules are plugged into a residual network to highlight important action regions around the actors. Finally, the classification scores of the four RAFN streams are combined by late fusion. Experimental results on HMDB51 and UCF101 demonstrate the effectiveness and excellent recognition performance of the proposed method.
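This record does not include the paper's implementation details, so the internals of CARM are not specified here. As an illustrative sketch only, a generic squeeze-and-excitation-style channel recalibration — the broad family that "channel-wise attention" modules belong to — can be written in NumPy as below; the function and weight names (`channel_recalibrate`, `w1`, `w2`) are hypothetical and not from the paper:

```python
import numpy as np

def channel_recalibrate(x, w1, w2):
    """Generic channel-wise attention recalibration (SE-style sketch).

    x  : (C, H, W) feature map
    w1 : (C//r, C) bottleneck weights, with reduction ratio r
    w2 : (C, C//r) expansion weights
    """
    # "Squeeze": global average pooling over the spatial dimensions
    z = x.mean(axis=(1, 2))                   # (C,)
    # "Excitation": bottleneck MLP, ReLU then a sigmoid gate
    h = np.maximum(w1 @ z, 0.0)               # (C//r,)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))       # (C,) per-channel weights in (0, 1)
    # Recalibrate: rescale each channel by its learned importance
    return x * s[:, None, None]
```

Because the gate is a sigmoid, each channel is scaled by a factor strictly between 0 and 1, which suppresses uninformative channels while preserving the feature map's shape — a property that makes such a module easy to plug into a residual block, as the abstract describes.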
Pages: 10
Related Papers
50 results
  • [41] Multi-Modal Fusion Sign Language Recognition Based on Residual Network and Attention Mechanism
    Chu Chaoqin
    Xiao Qinkun
    Zhang Yinhuan
    Xing, Liu
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2022, 36 (12)
  • [42] Video-based action recognition using spurious-3D residual attention networks
    Chen, Bo
    Tang, Hongying
    Zhang, Zebin
    Tong, Guanjun
    Li, Baoqing
    IET IMAGE PROCESSING, 2022, 16 (11) : 3097 - 3111
  • [43] Video inpainting based on residual convolution attention network
    Li De-cai
    Yan Qun
    Yao Jian-min
    Lin Zhi-xian
    Dong Ze-yu
    CHINESE JOURNAL OF LIQUID CRYSTALS AND DISPLAYS, 2022, 37 (01) : 86 - 96
  • [44] A Face Recognition Method for Sports Video Based on Feature Fusion and Residual Recurrent Neural Network
    Yan, Xu
    Informatica (Slovenia), 2024, 48 (12): 137 - 152
  • [45] Shrinking Temporal Attention in Transformers for Video Action Recognition
    Li, Bonan
    Xiong, Pengfei
    Han, Congying
    Guo, Tiande
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 1263 - 1271
  • [46] Integrating Temporal and Spatial Attention for Video Action Recognition
    Zhou, Yuanding
    Li, Baopu
    Wang, Zhihui
    Li, Haojie
    SECURITY AND COMMUNICATION NETWORKS, 2022, 2022
  • [47] Basketball Action Recognition Method of Deep Neural Network Based on Dynamic Residual Attention Mechanism
    Xiao, Jiongen
    Tian, Wenchun
    Ding, Liping
    INFORMATION, 2023, 14 (01)
  • [48] Deep Attention Network for Egocentric Action Recognition
    Lu, Minlong
    Li, Ze-Nian
    Wang, Yueming
    Pan, Gang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (08) : 3703 - 3713
  • [49] Dual attention convolutional network for action recognition
    Li, Xiaoqiang
    Xie, Miao
    Zhang, Yin
    Ding, Guangtai
    Tong, Weiqin
    IET IMAGE PROCESSING, 2020, 14 (06) : 1059 - 1065
  • [50] Attention-based network for effective action recognition from multi-view video
    Hoang-Thuyen Nguyen
    Thi-Oanh Nguyen
    KNOWLEDGE-BASED AND INTELLIGENT INFORMATION & ENGINEERING SYSTEMS (KSE 2021), 2021, 192 : 971 - 980