Diffusion-based framework for weakly-supervised temporal action localization

Cited by: 0

Authors:

Zou, Yuanbing [1]

Zhao, Qingjie [1]

Sarker, Prodip Kumar [1]

Li, Shanshan [1]

Wang, Lei [2]

Liu, Wangwang [2]

Affiliations:

[1] School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China

[2] Beijing Institute of Control Engineering, Beijing, 100190, China

Source:

Pattern Recognition | 2025 / Volume 160

Keywords:

Adversarial machine learning - Contrastive Learning - Federated learning - Semantics - Semi-supervised learning

DOI:

10.1016/j.patcog.2024.111207

Abstract:

Weakly supervised temporal action localization aims to localize action instances with only video-level supervision. Because frame-level annotations are unavailable, effectively separating action snippets from background in semantically ambiguous features is a major challenge for this task. To address this issue from a generative modeling perspective, we propose a novel two-stage diffusion-based network. In the first stage, we design a local masking module that learns local semantic information and generates binary masks, which (1) are used to perform action-background separation and (2) serve as the pseudo-ground truth required by the diffusion module. In the second stage, we propose a diffusion module that generates high-quality action predictions under the supervision of this pseudo-ground truth. In addition, we further optimize the new-refining operation in the local masking module to improve its efficiency. Experimental results demonstrate that the proposed method achieves promising performance on the publicly available mainstream datasets THUMOS14 and ActivityNet. The code is available at https://github.com/Rlab123/action_diff. © 2024

