A survey on deep learning-based spatio-temporal action detection

Citations: 1
Authors
Wang, Peng [1]
Zeng, Fanwei [2]
Qian, Yuntao [1]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, Hangzhou 310007, Zhejiang, Peoples R China
[2] Ant Grp, Hangzhou 310007, Zhejiang, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Computer vision; deep learning; spatio-temporal action detection; SEARCH;
DOI
10.1142/S0219691323500662
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Spatio-temporal action detection (STAD) aims to classify the actions present in a video and localize them in space and time. It has become a particularly active area of research in computer vision because of its rapidly growing real-world applications, such as autonomous driving, visual surveillance and entertainment. Considerable effort has been devoted in recent years to building a robust and effective framework for STAD. This paper provides a comprehensive review of state-of-the-art deep learning-based methods for STAD. First, a taxonomy is developed to organize these methods. Next, the linking algorithms, which associate frame- or clip-level detection results to form action tubes, are reviewed. Then, the commonly used benchmark datasets and evaluation metrics are introduced, and the performance of state-of-the-art models is compared. Finally, the paper concludes with a discussion of potential research directions for STAD.
Pages: 35