Deep cascaded action attention network for weakly-supervised temporal action localization

Cited by: 0

Authors:

Hui-fen Xia

Yong-zhao Zhan

Affiliations:

[1] Jiangsu University, School of Computer Science and Communication Engineering

[2] Changzhou Vocational Institute of Mechatronic Technology

[3] Jiangsu Engineering Research Center of Big Data Ubiquitous Perception and Intelligent Agriculture Applications

Source:

Multimedia Tools and Applications | 2023 / Vol. 82

Keywords:

Weakly-supervised; Temporal action localization; Deep cascaded action attention; Non-action suppression

DOI:

Not available

Abstract:

Weakly-supervised temporal action localization (W-TAL) aims to locate the boundaries of action instances in an untrimmed video and classify them, which is challenging because only video-level labels are available during training. Existing methods mainly focus on the most discriminative action snippets of a video by using top-k multiple instance learning (MIL), and neglect the less discriminative action snippets and the non-action snippets, which limits the gains in localization performance. To mine the less discriminative action snippets and better distinguish the non-action snippets in a video, a novel method based on a deep cascaded action attention network is proposed. In this method, a deep cascaded action attention mechanism models not only the most discriminative action snippets but also different levels of less discriminative action snippets by introducing threshold erasing, which helps ensure the completeness of action instances. In addition, an entropy loss for non-action is introduced to restrict the activations of non-action snippets for all action categories; these activations are generated by aggregating the bottom-k activation scores along the temporal dimension. As a result, action snippets are better separated from non-action snippets, which makes the localized action instances more accurate and facilitates more precise action localization. Extensive experiments conducted on the THUMOS14 and ActivityNet1.3 datasets show that our method outperforms state-of-the-art methods at several t-IoU thresholds.
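
The abstract names three ingredients: top-k aggregation for video-level classification, bottom-k aggregation with an entropy loss for non-action snippets, and threshold erasing between cascade stages. The following is a minimal NumPy sketch of one plausible reading of these ideas, not the authors' implementation. All function names, the mean-pooling choice, the softmax normalization, the exact form of the entropy term, and the direction of the erasing rule are assumptions made for illustration; the record does not specify them.

```python
import numpy as np


def topk_mean(cas, k):
    """Video-level class scores from the k highest activations per class.

    cas: class activation sequence of shape (T, C), T snippets, C classes.
    Mean pooling over the top k is an assumption; other pooling is possible.
    """
    return np.sort(cas, axis=0)[-k:].mean(axis=0)


def bottomk_mean(cas, k):
    """Non-action scores from the k lowest activations per class."""
    return np.sort(cas, axis=0)[:k].mean(axis=0)


def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - x.max())
    return e / e.sum()


def non_action_entropy_loss(cas, k):
    """Negative entropy of the class distribution of the bottom-k scores.

    Assumed form: minimizing this value pushes the distribution toward
    uniform, so that no action class responds strongly to non-action snippets.
    """
    p = softmax(bottomk_mean(cas, k))
    return float(np.sum(p * np.log(p + 1e-8)))


def erase_mask(attention, threshold):
    """Mask for the next cascade stage (assumed erasing rule).

    Snippets whose attention exceeds the threshold are set to 0 so that a
    later stage has to attend to less discriminative snippets.
    """
    return (attention <= threshold).astype(attention.dtype)
```

In such a sketch, a later cascade stage would receive the snippet features multiplied by `erase_mask(attention, threshold)[:, None]`, and the entropy term would be added to the top-k classification loss with a weighting factor chosen by validation.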

Pages: 29769 - 29787

Page count: 18

Total: 50 entries

[41] Action-Aware Network with Upper and Lower Limit Loss for Weakly-Supervised Temporal Action Localization
Mingwen Bi
Jiaqi Li
Xinliang Liu
Qingchuan Zhang
Zhenghong Yang
Neural Processing Letters, 2023, 55 : 4307 - 4324
[42] Deep snippet selective network for weakly supervised temporal action localization
Ge, Yongxin
Qin, Xiaolei
Yang, Dan
Jagersand, Martin
PATTERN RECOGNITION, 2021, 110
[43] Self-supervised temporal adaptive learning for weakly-supervised temporal action localization
Sheng, Jinrong
Yu, Jiaruo
Li, Ziqiang
Li, Ao
Ge, Yongxin
INFORMATION SCIENCES, 2025, 705
[44] Weakly-supervised temporal attention 3D network for human action recognition
Kim, Jonghyun
Li, Gen
Yun, Inyong
Jung, Cheolkon
Kim, Joongkyu
PATTERN RECOGNITION, 2021, 119
[45] Context Sensitive Network for weakly-supervised fine-grained temporal action localization
Dong, Cerui
Liu, Qinying
Wang, Zilei
Zhang, Yixin
Zhao, Feng
NEURAL NETWORKS, 2025, 185
[46] Adaptive Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization
Zhai, Yuanhao
Wang, Le
Tang, Wei
Zhang, Qilin
Zheng, Nanning
Doermann, David
Yuan, Junsong
Hua, Gang
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (04) : 4136 - 4151
[47] Weakly-Supervised Temporal Action Localization with Multi-Head Cross-Modal Attention
Ren, Hao
Ren, Haoran
Ran, Wu
Lu, Hong
Jin, Cheng
PRICAI 2022: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III, 2022, 13631 : 281 - 295
[48] W-ART: ACTION RELATION TRANSFORMER FOR WEAKLY-SUPERVISED TEMPORAL ACTION LOCALIZATION
Li, Mengzhu
Wu, Hongjun
Liu, Yongcheng
Liu, Hongzhe
Xu, Cheng
Li, Xuewei
2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022 : 2195 - 2199
[49] Action Completeness Modeling with Background Aware Networks for Weakly-Supervised Temporal Action Localization
Moniruzzaman, Md
Yin, Zhaozheng
He, Zhihai
Qin, Ruwen
Leu, Ming C.
MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020 : 2166 - 2174
[50] Semantic and Temporal Contextual Correlation Learning for Weakly-Supervised Temporal Action Localization
Fu, Jie
Gao, Junyu
Xu, Changsheng
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (10) : 12427 - 12443
