A two-stage temporal proposal network for precise action localization in untrimmed video

Cited by: 0
Authors
Fei Wang
Guorui Wang
Yuxuan Du
Zhenquan He
Yong Jiang
Affiliations
[1] Northeastern University,Faculty of Robot Science and Engineering
[2] Northeastern University,College of Information Science and Engineering
[3] Shenyang Institute of Automation, Chinese Academy of Sciences
Keywords
Action detection; Correctness discriminator; Extended context pooling; Temporal context regression;
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
In this paper, we propose a two-stage temporal proposal algorithm for action detection in long untrimmed videos. In the first stage, we propose a novel prior-minor watershed algorithm for action proposals, which combines a precise prior watershed proposal algorithm with a minor supplementary sliding-window algorithm. A correctness discriminator uses the sliding-window proposals to fill in the proposals that the watershed proposal algorithm may omit. In the second stage, we first propose extended context pooling (ECP), which has two modules (internal and context). The context module of ECP structures the proposals and enhances the extended features of action proposals. Different levels of ECP are introduced to model the action proposal region and make its extended context region more targeted and precise. We then propose a temporal context regression network, which adopts a multi-task loss to train temporal coordinate regression and action/background classification simultaneously, and outputs precise temporal boundaries for the proposals. We also propose prior-minor ranking to balance the effect of the prior watershed proposals and the minor supplementary proposals. On three large-scale benchmarks, THUMOS14, ActivityNet (v1.2 and v1.3), and Charades, our approach achieves superior performance compared with other state-of-the-art methods and runs at over 1020 frames per second (fps) on a single NVIDIA Titan-X Pascal GPU, indicating that our method can efficiently improve the precision of action localization.
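The first-stage idea of supplementing prior proposals with sliding-window proposals can be illustrated with a minimal sketch. The abstract does not describe how the correctness discriminator decides which windows to add, so the rule used here (add a window only when its temporal IoU with every prior proposal stays at or below a threshold) is an assumption made purely for illustration. The names `temporal_iou`, `sliding_windows`, `supplement`, and `max_overlap` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two temporal segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def sliding_windows(duration: float, length: float, stride: float) -> List[Segment]:
    """Fixed-length windows that fit entirely inside [0, duration]."""
    windows = []
    start = 0.0
    while start + length <= duration:
        windows.append((start, start + length))
        start += stride
    return windows


def supplement(prior: List[Segment], minor: List[Segment],
               max_overlap: float = 0.3) -> List[Segment]:
    """Keep every prior proposal and add a minor (sliding-window) proposal
    only if no prior proposal overlaps it by more than max_overlap.

    This overlap rule is an assumed stand-in for the paper's correctness
    discriminator, which the abstract does not specify.
    """
    kept = list(prior)
    for window in minor:
        if all(temporal_iou(window, p) <= max_overlap for p in prior):
            kept.append(window)
    return kept


# Hypothetical example: one prior proposal in a 12-second video.
prior_proposals = [(2.0, 6.0)]
minor_proposals = sliding_windows(duration=12.0, length=4.0, stride=4.0)
merged = supplement(prior_proposals, minor_proposals)
```

In this toy example, the windows (0, 4) and (4, 8) each overlap the prior proposal with a temporal IoU of 1/3 and are discarded, while the window (8, 12) covers a region with no prior proposal and is added.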
Pages: 2199-2211
Number of pages: 12