DUAL TEMPORAL TRANSFORMERS FOR FINE-GRAINED DANGEROUS ACTION RECOGNITION

Cited by: 0
Authors
Song, Wenfeng [1]
Jin, Xingliang [1]
Ding, Yang [1]
Gao, Yang [2, 3]
Hou, Xia [1]
Affiliations
[1] Beijing Informat Sci & Technol Univ, Comp Sch, Beijing, Peoples R China
[2] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing, Peoples R China
[3] Chinese Acad Med Sci, Res Unit Virtual Human & Virtual Surg 2019RU004, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Fine-grained Dangerous Action Recognition; Temporal Transformer; Action Recognition
DOI
10.1109/ICIP49359.2023.10222886
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recognizing dangerous actions is a critical task in computer vision, especially for surveillance applications. While existing deep learning methods have been successful in confined environments, they struggle with the anomalous and salient variations of human postures in dangerous actions. Additionally, finer-grained dangerous actions require more discriminative cues, adding to the complexity of the task. To address these challenges, we propose a novel solution that models the intrinsic and invariant properties of dangerous actions at multiple temporal semantic levels. Concretely, we propose Dual Temporal Transformers (DTT), which capture temporal interactions between distinct key points of the human body, aggregating them from shallow to deep layers while enlarging the receptive field from local to global. By doing so, our method avoids overfitting to unrelated or minor cues in videos and achieves a generalized representation of abnormal actions. We evaluate our approach in indoor and outdoor environments and find that DTT outperforms existing methods in both efficiency and accuracy. Our code and dataset are publicly available at https://github.com/AveryJohnsonJJ/DTT.git.
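The local-to-global idea in the abstract can be illustrated with a small sketch. This is not the authors' DTT implementation (see their repository for that); it is a minimal, assumption-laden example of temporal self-attention over per-frame key-point vectors, where a first pass attends only within a short window of neighboring frames and a second pass attends across the whole clip. The function names `temporal_attention` and `local_to_global`, the `window` parameter, and the absence of learned projections are all illustrative choices, not details taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(frames, window=None):
    """Scaled dot-product self-attention across time (hypothetical helper).

    frames: list of T vectors, each the flattened key-point coordinates
            of one video frame.
    window: if given, frame i attends only to frames j with |i - j| <= window
            (local receptive field); if None, it attends to every frame
            (global receptive field).
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    output = []
    for i, query in enumerate(frames):
        # Frames that frame i is allowed to attend to.
        visible = [j for j in range(len(frames))
                   if window is None or abs(i - j) <= window]
        scores = [sum(q * k for q, k in zip(query, frames[j])) / scale
                  for j in visible]
        weights = softmax(scores)
        # Weighted average of the visible frames' feature vectors.
        output.append([sum(w * frames[j][d] for w, j in zip(weights, visible))
                       for d in range(dim)])
    return output


def local_to_global(frames, local_window=1):
    """Assumed two-stage stack: a local pass followed by a global pass."""
    local = temporal_attention(frames, window=local_window)
    return temporal_attention(local, window=None)
```

In this sketch the window size plays the role of the receptive field: `window=0` leaves each frame unchanged, a small window mixes only neighboring frames, and `window=None` lets every frame see the full sequence. A real model would add learned query, key, and value projections, multiple heads, and attention over key points as well as time.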
Pages: 415-419
Number of pages: 5