Diffusion-based framework for weakly-supervised temporal action localization

Cited by: 0

Authors:

Zou, Yuanbing [1]

Zhao, Qingjie [1]

Sarker, Prodip Kumar [1]

Li, Shanshan [1]

Wang, Lei [2]

Liu, Wangwang [2]

Affiliations:

[1] School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China

[2] Beijing Institute of Control Engineering, Beijing 100190, China

Source:

Pattern Recognition | 2025 / Vol. 160

Keywords:

Adversarial machine learning - Contrastive Learning - Federated learning - Semantics - Semi-supervised learning

DOI:

10.1016/j.patcog.2024.111207

Abstract:

Weakly supervised temporal action localization aims to localize action instances with only video-level supervision. Because frame-level annotations are absent, effectively separating action snippets from background in semantically ambiguous features is a central challenge for this task. To address this issue from a generative modeling perspective, we propose a novel two-stage diffusion-based network. In the first stage, we design a local masking module that learns local semantic information and generates binary masks, which (1) are used to perform action-background separation and (2) serve as the pseudo-ground truth required by the diffusion module. In the second stage, we propose a diffusion module that generates high-quality action predictions under the supervision of this pseudo-ground truth. In addition, we further optimize the new-refining operation in the local masking module to improve its efficiency. Experimental results demonstrate that the proposed method achieves promising performance on the publicly available mainstream datasets THUMOS14 and ActivityNet. The code is available at https://github.com/Rlab123/action_diff. © 2024
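Illustrative sketch (not from the paper or its repository): the abstract does not specify how the binary masks are produced or how the diffusion module is parameterized. The following minimal Python example only illustrates the general idea under stated assumptions: a fixed threshold on hypothetical per-snippet scores stands in for the local masking module, and the standard DDPM forward-noising formula stands in for the diffusion process applied to the pseudo-ground truth. All function names, the threshold, and the `alpha_bar` value are assumptions made for illustration.

```python
import math
import random


def binary_mask(scores, threshold=0.5):
    """Threshold per-snippet scores into a 0/1 mask (assumed rule, not the paper's)."""
    return [1 if s >= threshold else 0 for s in scores]


def mask_to_segments(mask):
    """Convert a 0/1 mask into half-open [start, end) snippet-index segments."""
    segments, start = [], None
    for i, m in enumerate(mask):
        if m and start is None:
            start = i                      # a run of action snippets begins
        elif not m and start is not None:
            segments.append((start, i))    # the run ends just before index i
            start = None
    if start is not None:
        segments.append((start, len(mask)))
    return segments


def forward_diffuse(x0, alpha_bar, rng):
    """Standard DDPM forward noising: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps."""
    scale, sigma = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [scale * x + sigma * rng.gauss(0.0, 1.0) for x in x0]


# Hypothetical per-snippet scores for a 7-snippet video.
scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.6, 0.1]
mask = binary_mask(scores)                 # pseudo-ground truth (assumed form)
segments = mask_to_segments(mask)          # action-background separation
noisy = forward_diffuse([float(m) for m in mask], alpha_bar=0.9, rng=random.Random(0))
```

In a setup like this, a denoising network would be trained to recover the mask from `noisy`; that network is omitted here because the abstract gives no details about it.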
