A survey on deep learning-based spatio-temporal action detection

Cited by: 1
Authors
Wang, Peng [1]
Zeng, Fanwei [2]
Qian, Yuntao [1]
Affiliations
[1] Zhejiang Univ, Coll Comp Sci, Hangzhou 310007, Zhejiang, Peoples R China
[2] Ant Grp, Hangzhou 310007, Zhejiang, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Computer vision; deep learning; spatio-temporal action detection; SEARCH;
DOI
10.1142/S0219691323500662
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Spatio-temporal action detection (STAD) aims to classify the actions present in a video and localize them in space and time. It has become a particularly active area of research in computer vision because of its rapidly emerging real-world applications, such as autonomous driving, visual surveillance and entertainment. Much effort has been devoted in recent years to building a robust and effective framework for STAD. This paper provides a comprehensive review of the state-of-the-art deep learning-based methods for STAD. First, a taxonomy is developed to organize these methods. Next, the linking algorithms, which associate frame- or clip-level detection results to form action tubes, are reviewed. Then, the commonly used benchmark datasets and evaluation metrics are introduced, and the performance of state-of-the-art models is compared. Finally, the paper concludes with a discussion of potential research directions for STAD.
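The abstract describes linking algorithms as a step that associates frame-level detections into action tubes. The following is a minimal sketch of one simple way this could be done, assuming greedy matching by box overlap (IoU) between consecutive frames. It is not taken from the paper and does not reproduce any specific method the survey reviews; the names `Detection`, `Tube`, `link_detections` and the `iou_threshold` parameter are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed pixel coordinates


@dataclass
class Detection:
    frame: int    # frame index
    box: Box      # bounding box of the detected actor
    score: float  # detector confidence for one action class


@dataclass
class Tube:
    detections: List[Detection] = field(default_factory=list)

    @property
    def score(self) -> float:
        # Mean detection score, used here only to decide which tube is extended first.
        return sum(d.score for d in self.detections) / len(self.detections)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def link_detections(frames: List[List[Detection]],
                    iou_threshold: float = 0.5) -> List[Tube]:
    """Greedily link per-frame detections of one action class into tubes.

    Assumption: a tube ends as soon as no remaining detection in the next
    frame overlaps its last box by at least `iou_threshold`.
    """
    finished: List[Tube] = []
    active: List[Tube] = []
    for dets in frames:
        unmatched = list(dets)
        still_active: List[Tube] = []
        # Extend higher-scoring tubes first so they get first pick of detections.
        for tube in sorted(active, key=lambda t: t.score, reverse=True):
            last_box = tube.detections[-1].box
            best = max(unmatched, key=lambda d: iou(last_box, d.box), default=None)
            if best is not None and iou(last_box, best.box) >= iou_threshold:
                tube.detections.append(best)
                unmatched.remove(best)
                still_active.append(tube)
            else:
                finished.append(tube)
        # Every detection left over starts a new tube.
        still_active.extend(Tube([d]) for d in unmatched)
        active = still_active
    return finished + active
```

Greedy matching is chosen here only because it is short and easy to follow. A linking step could instead tolerate missed detections, or optimize the whole sequence jointly, and this sketch does neither.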
Pages: 35