Towards efficient video-based action recognition: context-aware memory attention network

Cited by: 2
Authors
Koh, Thean Chun [1]
Yeo, Chai Kiat [1]
Jing, Xuan [1,2]
Sivadas, Sunil [2]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Sci & Engn, 50 Nanyang Ave, Singapore 639798, Singapore
[2] NCS Pte Ltd, Ang Mo Kio St 62, Singapore 569141, Singapore
Source
SN APPLIED SCIENCES | 2023 / Vol. 5 / Issue 12
Keywords
Action recognition; Deep learning; Convolutional neural network; Attention; Bidirectional LSTM; Classification
DOI
10.1007/s42452-023-05568-5
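Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract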
Given the prevalence of surveillance cameras in our daily lives, human action recognition from videos has significant practical applications. A persistent challenge in this field is to develop more efficient models capable of real-time recognition with high accuracy for widespread deployment. In this research paper, we introduce a novel human action recognition model named Context-Aware Memory Attention Network (CAMA-Net), which eliminates the need for optical flow extraction and 3D convolution, both of which are computationally intensive. By removing these components, CAMA-Net is more computationally efficient than many existing approaches. A pivotal component of CAMA-Net is the Context-Aware Memory Attention Module, an attention module that computes the relevance score between key-value pairs obtained from the 2D ResNet backbone. This process establishes correspondences between video frames. To validate our method, we conduct experiments on four well-known action recognition datasets: ActivityNet, Diving48, HMDB51 and UCF101. The experimental results demonstrate the effectiveness of our proposed model, which surpasses the performance of existing 2D-CNN based baseline models.
Article Highlights
  • Recent human action recognition models are not yet ready for practical applications due to high computation needs.
  • We propose a 2D CNN-based human action recognition method to reduce the computation load.
  • The proposed method achieves competitive performance compared to most SOTA 2D CNN-based methods on public datasets.
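The abstract does not give the internals of the Context-Aware Memory Attention Module, so the following is only a generic illustration of how relevance scores between per-frame keys can weight per-frame values to relate frames to one another. It is a minimal sketch, not CAMA-Net itself. Everything in it is an assumption made for illustration: the function name `frame_memory_attention`, the projection matrices `w_key` and `w_value`, the scaled dot-product scoring, and the shapes. Random vectors stand in for pooled 2D ResNet frame features.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def frame_memory_attention(features, w_key, w_value):
    """Hypothetical key-value attention across video frames.

    features: (T, D) array, one feature vector per frame
              (stand-in for pooled 2D-CNN backbone outputs).
    Returns (T, Dv) context-enriched features and (T, T) relevance weights.
    """
    keys = features @ w_key        # (T, Dk) one key per frame
    values = features @ w_value    # (T, Dv) one value per frame
    # Relevance score between every pair of frames (scaled dot product).
    scores = keys @ keys.T / np.sqrt(keys.shape[1])   # (T, T)
    weights = softmax(scores, axis=-1)                # each row sums to 1
    # Each frame reads a weighted mix of the values of all frames.
    return weights @ values, weights

# Illustrative sizes only; none of these come from the paper.
rng = np.random.default_rng(0)
T, D, Dk, Dv = 8, 512, 64, 128
features = rng.standard_normal((T, D))
w_key = rng.standard_normal((D, Dk)) / np.sqrt(D)
w_value = rng.standard_normal((D, Dv)) / np.sqrt(D)

context, weights = frame_memory_attention(features, w_key, w_value)
print(context.shape, weights.shape)
```

Because the scores are computed on 2D per-frame features, this kind of temporal mixing costs one T-by-T matrix product per clip, which is the general reason such designs can avoid optical flow and 3D convolution.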
Pages: 12