Cascading spatio-temporal attention network for real-time action detection

Cited by: 0
Authors
Jianhua Yang
Ke Wang
Ruifeng Li
Petra Perner
Affiliations
[1] Harbin Institute of Technology, State Key Laboratory of Robotics and System
[2] Harbin Institute of Technology, Zhengzhou Research Institute
[3] FutureLab Artificial Intelligence IBaI-2
Source
Machine Vision and Applications, 2023, 34 (06)
Keywords
Spatio-temporal action detection; Human behavior analysis; Spatio-temporal attention
Abstract
Accurately detecting human actions in video has many applications, such as video surveillance and somatosensory games. In this paper, we propose a spatial-aware attention module (SAM) and a temporal-aware attention module (TAM) for spatio-temporal action detection in videos. SAM first concatenates the feature maps of consecutive frames along the channel dimension and then applies a dilated convolutional layer followed by a sigmoid function to generate a spatial attention map. Because the resulting attention map contains spatial information from consecutive frames, it helps the detector focus on salient spatial features and localize action instances more accurately across those frames. TAM uses several fully connected layers to generate a temporal attention map. This map models the temporal association of each spatial feature, so it captures how action instances are related over time and improves the detector's ability to track actions. To evaluate the effectiveness of SAM and TAM, we build an efficient, strong anchor-free action detector, the cascading spatio-temporal attention network, which combines a 2D backbone with the SAM and TAM modules. Extensive experiments on two benchmarks, JHMDB and UCF101-24, demonstrate the effectiveness of SAM and TAM.
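The two modules described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no layer sizes, so the 3x3 kernel, the dilation of 2, the single-channel attention output, the channel-mean descriptor fed to the fully connected layers, the hidden width, and the SAM-then-TAM ordering are all assumptions made here for illustration, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dilated_conv2d(x, weight, dilation=2):
    """Zero-padded, same-size dilated 2D convolution (cross-correlation).
    x: (C_in, H, W); weight: (C_out, C_in, k, k) -> (C_out, H, W)."""
    c_out, _, k, _ = weight.shape
    pad = dilation * (k - 1) // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, w = x.shape
    out = np.zeros((c_out, h, w))
    for i in range(k):
        for j in range(k):
            # Input window seen by kernel tap (i, j), spaced by `dilation`.
            patch = xp[:, i * dilation:i * dilation + h,
                       j * dilation:j * dilation + w]
            out += np.einsum("oc,chw->ohw", weight[:, :, i, j], patch)
    return out

def spatial_attention(frames, weight, dilation=2):
    """SAM-style step (assumed shapes). frames: (K, C, H, W);
    weight: (1, K*C, k, k). Returns reweighted frames and a (1, H, W) map."""
    k, c, h, w = frames.shape
    stacked = frames.reshape(k * c, h, w)  # concatenate frames on the channel axis
    attn = sigmoid(dilated_conv2d(stacked, weight, dilation))
    return frames * attn[None], attn       # one map shared by all K frames

def temporal_attention(frames, w1, w2):
    """TAM-style step (assumed design). w1: (K, hidden); w2: (hidden, K).
    Two fully connected layers act on the K-frame vector at each location."""
    desc = np.moveaxis(frames.mean(axis=1), 0, -1)  # (H, W, K) channel-mean descriptor
    hidden = np.maximum(desc @ w1, 0.0)             # ReLU, (H, W, hidden)
    attn = sigmoid(hidden @ w2)                     # (H, W, K)
    attn = np.moveaxis(attn, -1, 0)[:, None]        # (K, 1, H, W)
    return frames * attn, attn

# Toy example with random features and weights (all sizes are assumptions).
rng = np.random.default_rng(0)
K, C, H, W = 3, 4, 8, 8
frames = rng.standard_normal((K, C, H, W))
sam_w = rng.standard_normal((1, K * C, 3, 3)) * 0.1
w1 = rng.standard_normal((K, 8)) * 0.1
w2 = rng.standard_normal((8, K)) * 0.1

x, s_attn = spatial_attention(frames, sam_w)  # spatial attention first
x, t_attn = temporal_attention(x, w1, w2)     # then temporal attention
print(x.shape, s_attn.shape, t_attn.shape)
```

In this sketch the spatial map is shared by all frames, while the temporal map assigns each spatial location a separate weight per frame, which is one plausible reading of "the temporal association of each spatial feature".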
Related papers
  • [1] Cascading spatio-temporal attention network for real-time action detection
    Yang, Jianhua
    Wang, Ke
    Li, Ruifeng
    Perner, Petra
    [J]. MACHINE VISION AND APPLICATIONS, 2023, 34 (06)
  • [2] Real-Time Action Detection Based on Spatio-Temporal Interaction Perception
    Ke, Xiao
    Miao, Xin
    Guo, Wen-Zhong
    [J]. Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2024, 52 (02): 574-588
  • [3] Real-time Online Action Detection Forests using Spatio-temporal Contexts
    Baek, Seungryul
    Kim, Kwang In
    Kim, Tae-Kyun
    [J]. 2017 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2017), 2017: 158-167
  • [4] Real-time Spatio-Temporal Action Localization in 360 Videos
    Chen, Bo
    Ali-Eldin, Ahmed
    Shenoy, Prashant
    Nahrstedt, Klara
    [J]. 2020 IEEE INTERNATIONAL SYMPOSIUM ON MULTIMEDIA (ISM 2020), 2020: 73-76
  • [5] Learning motion representation for real-time spatio-temporal action localization
    Zhang, Dejun
    He, Linchao
    Tu, Zhigang
    Zhang, Shifu
    Han, Fei
    Yang, Boxiong
    [J]. PATTERN RECOGNITION, 2020, 103
  • [6] Spatio-Temporal Attention Networks for Action Recognition and Detection
    Li, Jun
    Liu, Xianglong
    Zhang, Wenxuan
    Zhang, Mingyuan
    Song, Jingkuan
    Sebe, Nicu
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (11): 2990-3001
  • [7] Learning spatio-temporal convolutional network for real-time object tracking
    Chen, Hanzao
    Xing, Xiaofen
    Xu, Xiangmin
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 2153-2157
  • [8] Spatio-temporal view interpolation in real-time
    Radtke, T
    [J]. VISUAL COMMUNICATIONS AND IMAGE PROCESSING 2003, PTS 1-3, 2003, 5150: 1939-1946
  • [9] Real-time spatio-temporal event detection on geotagged social media
    George, Yasmeen
    Karunasekera, Shanika
    Harwood, Aaron
    Lim, Kwan Hui
    [J]. JOURNAL OF BIG DATA, 2021, 8 (01)