Diffusion-based framework for weakly-supervised temporal action localization

Authors
Zou, Yuanbing [1]
Zhao, Qingjie [1]
Sarker, Prodip Kumar [1]
Li, Shanshan [1]
Wang, Lei [2]
Liu, Wangwang [2]
Affiliations
[1] School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
[2] Beijing Institute of Control Engineering, Beijing, 100190, China
Keywords
Adversarial machine learning; Contrastive learning; Federated learning; Semantics; Semi-supervised learning
DOI
10.1016/j.patcog.2024.111207
Abstract
Weakly supervised temporal action localization aims to localize action instances with only video-level supervision. Because frame-level annotations are absent, effectively separating action snippets from background in semantically ambiguous features is a major challenge for this task. To address this issue from a generative modeling perspective, we propose a novel two-stage diffusion-based network. In the first stage, we design a local masking module that learns local semantic information and generates binary masks, which (1) are used to perform action-background separation and (2) serve as the pseudo-ground truth required by the diffusion module. In the second stage, we propose a diffusion module that generates high-quality action predictions under this pseudo-ground-truth supervision. In addition, we further optimize the new-refining operation in the local masking module to improve its efficiency. Experimental results demonstrate that the proposed method achieves promising performance on the publicly available mainstream datasets THUMOS14 and ActivityNet. The code is available at https://github.com/Rlab123/action_diff. © 2024
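As a rough illustration of the two ideas named in the abstract, the sketch below thresholds per-snippet scores into a binary mask and then applies the standard DDPM-style forward noising step to that mask, as one might when treating it as a pseudo-ground truth for a diffusion model. This is not the authors' implementation: the function names, the 0.5 threshold, the linear noise schedule, and the example scores are all assumptions made for illustration.

```python
import math
import random


def binary_mask(scores, threshold=0.5):
    """Threshold per-snippet action scores into a 0/1 mask (1 = action).

    The threshold value is an illustrative assumption.
    """
    return [1.0 if s >= threshold else 0.0 for s in scores]


def linear_alpha_bar(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule.

    The schedule and its endpoints are assumptions, not taken from the paper.
    """
    alpha_bar, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar


def forward_noise(mask, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(a_bar_t) * x_0, (1 - a_bar_t) * I)."""
    a = alpha_bar[t]
    return [
        math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for x in mask
    ]


# Made-up per-snippet scores standing in for a masking module's output.
scores = [0.1, 0.2, 0.8, 0.9, 0.7, 0.3]
mask = binary_mask(scores)
alpha_bar = linear_alpha_bar(num_steps=1000)
rng = random.Random(0)
noisy = forward_noise(mask, t=500, alpha_bar=alpha_bar, rng=rng)
```

In a full pipeline, a denoising network would be trained to recover `mask` from `noisy` conditioned on video features; that part is omitted here because the abstract does not describe it.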