SMC: Single-Stage Multi-location Convolutional Network for Temporal Action Detection

Cited by: 0
Authors
Liu, Zhikang [1, 2]
Wang, Zilei [1 ]
Zhao, Yan [1 ]
Tian, Ye [1 ]
Affiliations
[1] Univ Sci & Technol China, Dept Automat, Hefei, Anhui, Peoples R China
[2] Megvii Inc Face, Beijing, Peoples R China
Source
Funding
National Natural Science Foundation of China;
Keywords
Temporal action detection; End-to-end; Multi-scale; SMC;
DOI
10.1007/978-3-030-20890-5_12
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Temporal action detection in untrimmed videos is an important and challenging visual task. State-of-the-art works typically adopt a multi-stage pipeline, i.e., class-agnostic segment proposal followed by multi-label action classification. This pipeline is computationally slow and hard to optimize because each stage needs to be trained separately. Moreover, a desirable method should go beyond segment-level localization and make dense predictions with precise boundaries. In this paper we introduce a novel detection model, the Single-stage Multi-location Convolutional Network (SMC), which completely eliminates proposal generation and spatio-temporal feature resampling, and predicts frame-level action locations with class probabilities in a unified end-to-end network. Specifically, we associate a set of multi-scale default locations with each feature map cell in multiple layers, then predict the location offsets relative to the default locations, as well as the action categories. In practice, SMC is faster than existing methods (753 FPS on a Titan X Maxwell GPU) and achieves state-of-the-art performance on THUMOS'14 and MEXaction2.
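The default-location idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the layer sizes, the scales, or the offset parameterization, so everything here is an assumption for illustration: the `layers` configuration is hypothetical, and the center/width offset encoding is borrowed from common single-shot detectors rather than taken from the paper.

```python
import math


def default_locations(num_cells, scales):
    """Return (center, width) pairs, normalized to [0, 1], for one 1D temporal feature map.

    Each feature map cell gets one default location per scale (assumed layout).
    """
    locations = []
    for i in range(num_cells):
        center = (i + 0.5) / num_cells  # cell midpoint on the normalized time axis
        for scale in scales:
            locations.append((center, scale))
    return locations


def decode(default, offset):
    """Apply a predicted (d_center, d_width) offset to a default location.

    Assumed encoding: the center shifts by d_center * width and the width
    scales by exp(d_width). Returns (start, end) clipped to [0, 1].
    """
    center, width = default
    d_center, d_width = offset
    new_center = center + d_center * width
    new_width = width * math.exp(d_width)
    start = max(0.0, new_center - new_width / 2)
    end = min(1.0, new_center + new_width / 2)
    return start, end


# Hypothetical multi-layer configuration: (number of cells, scales per cell).
# Coarser layers carry fewer cells and longer default segments.
layers = [(16, (0.05, 0.1)), (8, (0.2, 0.3)), (4, (0.4, 0.6))]
all_defaults = [
    loc for cells, scales in layers for loc in default_locations(cells, scales)
]
```

In a setup like this, a network would emit one offset pair and one class-score vector per entry of `all_defaults`, and `decode` would turn each pair into a temporal segment; a zero offset leaves the default location unchanged.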
Pages: 179-195
Page count: 17