Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization

Cited by: 34
Authors
Lee, Pilhyeon [1]
Byun, Hyeran [1,2]
Affiliations
[1] Yonsei Univ, Dept Comp Sci, Seoul, South Korea
[2] Yonsei Univ, Grad Sch AI, Seoul, South Korea
Funding
National Research Foundation, Singapore
Keywords
DOI
10.1109/ICCV48922.2021.01339
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
We tackle the problem of localizing temporal intervals of actions with only a single frame label per action instance for training. Owing to label sparsity, existing work fails to learn action completeness, resulting in fragmentary action predictions. In this paper, we propose a novel framework, where dense pseudo-labels are generated to provide completeness guidance for the model. Concretely, we first select pseudo background points to supplement point-level action labels. Then, by taking the points as seeds, we search for the optimal sequence that is likely to contain complete action instances while agreeing with the seeds. To learn completeness from the obtained sequence, we introduce two novel losses that contrast action instances with background ones in terms of action score and feature similarity, respectively. Experimental results demonstrate that our completeness guidance indeed helps the model to locate complete action instances, leading to large performance gains especially under high IoU thresholds. Moreover, we show the superiority of our method over existing state-of-the-art methods on four benchmarks: THUMOS'14, GTEA, BEOID, and ActivityNet. Notably, our method even performs comparably to recent fully-supervised methods, at a 6x cheaper annotation cost. Our code is available at https://github.com/Pilhyeon.
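As a concrete illustration of the completeness guidance described in the abstract, the sketch below shows one way pseudo action instances could be contrasted with pseudo background instances in terms of action score and feature similarity. The tensor names (action_scores, action_feats, etc.), shapes, margins, and the hinge/cosine formulations are assumptions made for illustration; this is not the paper's actual loss implementation.

```python
# Minimal sketch (PyTorch) of score- and feature-level contrast between
# pseudo action instances and pseudo background instances.
# All names, shapes, and formulations are illustrative assumptions,
# not the authors' implementation.
import torch
import torch.nn.functional as F


def score_contrast_loss(action_scores, background_scores, margin=1.0):
    # action_scores: (Na,) scores aggregated over pseudo action instances
    # background_scores: (Nb,) scores aggregated over pseudo background instances
    # Hinge loss: each action instance should outscore each background
    # instance by at least `margin`.
    diff = margin - (action_scores.unsqueeze(1) - background_scores.unsqueeze(0))
    return F.relu(diff).mean()


def feature_contrast_loss(action_feats, background_feats, margin=0.5):
    # action_feats: (Na, D), background_feats: (Nb, D) instance-level features.
    # Encourage action features to be closer (in cosine similarity) to each
    # other than to background features.
    a = F.normalize(action_feats, dim=-1)
    b = F.normalize(background_feats, dim=-1)
    intra = (a @ a.t()).mean()  # avg action-action similarity (incl. self-pairs)
    inter = (a @ b.t()).mean()  # avg action-background similarity
    return F.relu(margin - (intra - inter))
```

In the paper, the instances entering such contrasts come from the optimal sequence searched from the point-level seeds; in this sketch they are simply taken as given inputs.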
Pages: 13628-13637
Number of pages: 10
Related Papers
50 in total
  • [1] Temporal RPN Learning for Weakly-Supervised Temporal Action Localization
    Huang, Jing
    Kong, Ming
    Chen, Luyuan
    Liang, Tian
    Zhu, Qiang
    [J]. ASIAN CONFERENCE ON MACHINE LEARNING, VOL 222, 2023, 222
  • [2] Action Completeness Modeling with Background Aware Networks for Weakly-Supervised Temporal Action Localization
    Moniruzzaman, Md
    Yin, Zhaozheng
    He, Zhihai
    Qin, Ruwen
    Leu, Ming C.
    [J]. MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 2166 - 2174
  • [3] Vectorized Evidential Learning for Weakly-Supervised Temporal Action Localization
    Gao, Junyu
    Chen, Mengyuan
    Xu, Changsheng
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 15949 - 15963
  • [4] Weakly-supervised temporal action localization: a survey
    Baraka, AbdulRahman
    Noor, Mohd Halim Mohd
    [J]. NEURAL COMPUTING & APPLICATIONS, 2022, 34 (11) : 8479 - 8499
  • [6] ACTION RELATIONAL GRAPH FOR WEAKLY-SUPERVISED TEMPORAL ACTION LOCALIZATION
    Cheng, Yi
    Sun, Ying
    Lin, Dongyun
    Lim, Joo-Hwee
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 2563 - 2567
  • [7] Action Coherence Network for Weakly-Supervised Temporal Action Localization
    Zhai, Yuanhao
    Wang, Le
    Tang, Wei
    Zhang, Qilin
    Zheng, Nanning
    Hua, Gang
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2022, 24 : 1857 - 1870
  • [8] Semantic and Temporal Contextual Correlation Learning for Weakly-Supervised Temporal Action Localization
    Fu, Jie
    Gao, Junyu
    Xu, Changsheng
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (10) : 12427 - 12443
  • [9] CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning
    Zhang, Can
    Cao, Meng
    Yang, Dongming
    Chen, Jie
    Zou, Yuexian
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 16005 - 16014
  • [10] Dual-Evidential Learning for Weakly-supervised Temporal Action Localization
    Chen, Mengyuan
    Gao, Junyu
    Yang, Shicai
    Xu, Changsheng
    [J]. COMPUTER VISION - ECCV 2022, PT IV, 2022, 13664 : 192 - 208