Spatio-Temporal Action Detector with Self-Attention

Cited by: 2
Authors:
Ma, Xurui [1]
Luo, Zhigang [1,2]
Zhang, Xiang [1,3,4]
Liao, Qing [5]
Shen, Xingyu [1]
Wang, Mengzhu [1]
Affiliations:
[1] Natl Univ Def Technol, Coll Comp, Changsha 410073, Peoples R China
[2] Natl Univ Def Technol, Sci & Technol Parallel & Distributed Lab, Changsha 410073, Hunan, Peoples R China
[3] Natl Univ Def Technol, Inst Quantum, Changsha 410073, Hunan, Peoples R China
[4] Natl Univ Def Technol, State Key Lab High Performance Comp, Changsha 410073, Hunan, Peoples R China
[5] Harbin Inst Technol Shenzhen, Dept Comp Sci & Technol, Shenzhen 518055, Peoples R China
Funding:
National Natural Science Foundation of China;
Keywords:
Spatio-temporal action detection; self-attention; tubelets link algorithm;
DOI:
10.1109/IJCNN52387.2021.9533300
Chinese Library Classification (CLC):
TP18 [Artificial Intelligence Theory];
Discipline Codes:
081104; 0812; 0835; 1405;
Abstract:
In the field of spatio-temporal action detection, some recent studies tackle action detection with one-stage, anchor-free object detectors. Although efficient, these detectors still leave room for performance improvement. Towards this goal, a Self-Attention MovingCenter Detector (SAMOC) is proposed, which offers two attractive aspects: 1) to effectively capture motion cues, a spatio-temporal self-attention block is explored to reinforce feature representation by aggregating motion-dependent global contexts, and 2) a link branch models the frame-level object dependency, which promotes the confidence scores of correct actions. Experiments on two benchmark datasets show that SAMOC with these two aspects achieves state-of-the-art results while running in real time.
Pages: 8
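
The record only summarizes the method; the paper's own implementation is not reproduced here. As a rough illustration of the kind of spatio-temporal self-attention the abstract describes (reinforcing clip features by aggregating global context across all frame positions), below is a minimal non-local-style sketch in PyTorch. The class name SpatioTemporalSelfAttention, the (B, C, T, H, W) tensor layout, and the channel-reduction factor are illustrative assumptions, not details taken from SAMOC.

```python
# Minimal sketch (not the authors' code): a non-local-style spatio-temporal
# self-attention block over a clip feature map of shape (B, C, T, H, W).
import torch
import torch.nn as nn


class SpatioTemporalSelfAttention(nn.Module):
    """Aggregates global spatio-temporal context via dot-product attention.

    Assumed layout: x is (batch, channels, frames, height, width).
    """

    def __init__(self, channels: int, reduction: int = 2):
        super().__init__()
        inner = channels // reduction
        # 1x1x1 convolutions produce query/key/value embeddings.
        self.query = nn.Conv3d(channels, inner, kernel_size=1)
        self.key = nn.Conv3d(channels, inner, kernel_size=1)
        self.value = nn.Conv3d(channels, inner, kernel_size=1)
        self.out = nn.Conv3d(inner, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, t, h, w = x.shape
        n = t * h * w  # number of spatio-temporal positions
        q = self.query(x).reshape(b, -1, n).permute(0, 2, 1)  # (B, N, C')
        k = self.key(x).reshape(b, -1, n)                      # (B, C', N)
        v = self.value(x).reshape(b, -1, n).permute(0, 2, 1)   # (B, N, C')
        # Scaled dot-product attention over all positions in the clip.
        attn = torch.softmax(q @ k / (q.shape[-1] ** 0.5), dim=-1)  # (B, N, N)
        ctx = (attn @ v).permute(0, 2, 1).reshape(b, -1, t, h, w)
        return x + self.out(ctx)  # residual connection keeps original features


if __name__ == "__main__":
    feats = torch.randn(1, 64, 5, 14, 14)  # 5-frame clip of 14x14 feature maps
    block = SpatioTemporalSelfAttention(64)
    print(block(feats).shape)  # torch.Size([1, 64, 5, 14, 14])
```

In the actual detector such a block would sit inside the backbone and be paired with a tubelet-linking step across frames (the "link branch" in the abstract); this sketch only illustrates the attention mechanism itself.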