Dilated Multi-Temporal Modeling for Action Recognition

Cited by: 0
Authors
Zhang, Tao [1]
Wu, Yifan [1]
Li, Xiaoqiang [1]
Affiliations
[1] Shanghai Univ, Sch Comp Engn & Sci, Shanghai 200444, Peoples R China
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 12
Keywords
computer vision; action recognition; multiple temporal modeling; dilated convolution
DOI
10.3390/app13126934
Chinese Library Classification (CLC) number
O6 [Chemistry]
Discipline classification code
0703
Abstract
Action recognition requires capturing temporal information from video clips, yet the duration of the same action varies from video to video. Because the scale of temporal context is so diverse, the uniform-size kernels used in convolutional neural networks (CNNs) limit the capability of multi-scale temporal modeling. In this paper, we propose a novel dilated multi-temporal (DMT) module that provides a solution for modeling multi-temporal information in action recognition. By using dilated convolutions with different dilation rates in different feature map channels, the DMT module captures information at multiple scales without the costly multi-branch networks, input-level frame pyramids, or feature map stacking that previous works have usually required. This approach therefore enables the integration of temporal information from multiple scales. In addition, the DMT module can be integrated into existing 2D CNNs, making it a straightforward and intuitive solution to the challenge of multi-temporal modeling. Our proposed method achieves accuracy improvements of about 2% and 1% on FineGym99 and SthV1, respectively. We also conducted an empirical analysis that demonstrates how DMT improves the classification accuracy for action classes with varying durations.
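The core idea in the abstract, applying different dilation rates to different feature map channels along the time axis, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the 3-tap depthwise kernel, the zero padding, the contiguous channel grouping, the rates (1, 2, 3), and the function names `assign_dilations` and `dilated_temporal_conv` are all assumptions made for illustration, since the abstract does not specify them.

```python
def assign_dilations(num_channels, rates=(1, 2, 3)):
    """Split channels into contiguous groups, one dilation rate per group.

    The grouping scheme and the default rates are illustrative assumptions.
    """
    group = -(-num_channels // len(rates))  # ceiling division
    return [rates[min(c // group, len(rates) - 1)] for c in range(num_channels)]


def dilated_temporal_conv(x, weights, dilations):
    """Depthwise 3-tap temporal convolution with a per-channel dilation rate.

    x:         list of T frames, each a list of C channel values
    weights:   list of C kernels [w_prev, w_center, w_next]
    dilations: list of C dilation rates, one per channel
    Frames outside [0, T) are treated as zeros, so the output is also T x C.
    """
    T, C = len(x), len(x[0])
    out = [[0.0] * C for _ in range(T)]
    for c in range(C):
        d = dilations[c]
        for t in range(T):
            acc = 0.0
            # A dilation of d samples frames t - d, t, and t + d.
            for k, offset in enumerate((-d, 0, d)):
                s = t + offset
                if 0 <= s < T:
                    acc += weights[c][k] * x[s][c]
            out[t][c] = acc
    return out


# Toy input: 5 frames, 2 channels, where each value equals the frame index.
frames = [[float(t), float(t)] for t in range(5)]
kernels = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
result = dilated_temporal_conv(frames, kernels, dilations=[1, 2])
```

With this toy input, channel 0 (dilation 1) sums adjacent frames while channel 1 (dilation 2) sums frames two steps apart, so a single layer mixes short-range and longer-range temporal context without extra branches. In a deep-learning framework the same effect would typically be obtained with grouped 1D convolutions, one group per dilation rate.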
Pages: 15
Related papers
50 in total
  • [1] Multi-Temporal Convolutions for Human Action Recognition in Videos
    Stergiou, Alexandros
    Poppe, Ronald
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [2] Multi-Level Temporal Dilated Dense Prediction for Action Recognition
    Wang, Jinpeng
    Lin, Yiqi
    Zhang, Manlin
    Gao, Yuan
    Ma, Andy J.
    IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 2553 - 2566
  • [3] DenseGCN: A multi-level and multi-temporal graph convolutional network for action recognition
    Yu, Chengzhang
    Bao, Wenxia
    IET IMAGE PROCESSING, 2023, 17 (12) : 3401 - 3410
  • [4] 3D ACTION RECOGNITION USING MULTI-TEMPORAL SKELETON VISUALIZATION
    Liu, Mengyuan
    Chen, Chen
    Meng, Fanyang
    Liu, Hong
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW), 2017,
  • [5] Action Recognition Using Multi-Temporal DMMs Based on Adaptive Vague Division
    Jiang, Min
    Jin, Ke
    Kong, Jun
    PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON IMAGE AND GRAPHICS PROCESSING (ICIGP 2018), 2018, : 8 - 13
  • [6] Multi-view region-adaptive multi-temporal DMM and RGB action recognition
    Mahmoud Al-Faris
    John P. Chiverton
    Yanyan Yang
    David Ndzi
    Pattern Analysis and Applications, 2020, 23 : 1587 - 1602
  • [7] Multi-view region-adaptive multi-temporal DMM and RGB action recognition
    Al-Faris, Mahmoud
    Chiverton, John P.
    Yang, Yanyan
    Ndzi, David L.
    PATTERN ANALYSIS AND APPLICATIONS, 2020, 23 (04) : 1587 - 1602
  • [8] Statistical HOG on Multi-temporal Depth Motion Maps Approach for Human Action Recognition
    Ali, Heba Hamdy
    Youssif, Aliaa A. A.
    Moftah, Hossam M.
    PROCEEDINGS OF THE XX INTERNATIONAL CONFERENCE ON HUMAN-COMPUTER INTERACTION (INTERACCION'2019), 2019,
  • [9] Temporal Modeling on Multi-Temporal-Scale Spatiotemporal Atoms for Action Recognition
    Yao, Guangle
    Lei, Tao
    Liu, Xianyuan
    Jiang, Ping
    APPLIED SCIENCES-BASEL, 2018, 8 (10):
  • [10] MRTP: Multi-Temporal Resolution Real-Time Action Recognition Approach by Time-Action Perception
    Zhang K.
    Yang J.
    Zhang D.
    Chen Y.
    Li J.
    Du S.
    Hsi-An Chiao Tung Ta Hsueh/Journal of Xi'an Jiaotong University, 2022, 56 (03) : 22 - 32