A Snippets Relation and Hard-Snippets Mask Network for Weakly-Supervised Temporal Action Localization

Cited by: 0
Authors
Zhao, Yibo [1]
Zhang, Hua [1]
Gao, Zan [1, 2]
Guan, Weili [3]
Wang, Meng [4]
Chen, Shengyong [1]
Affiliations
[1] Tianjin Univ Technol, Key Lab Comp Vis & Syst, Minist Educ, Tianjin 300384, Peoples R China
[2] Qilu Univ Technol, Shandong Acad Sci, Shandong Artificial Intelligence Inst, Jinan 250014, Peoples R China
[3] Monash Univ, Fac Informat Technol, Clayton Campus, Clayton, Vic 3800, Australia
[4] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230009, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Location awareness; Task analysis; Proposals; Circuits and systems; Uncertainty; Prototypes; Multitasking; Weakly-supervised temporal action localization; snippets relation module; hard-snippets mask module; snippet enhancement loss; DISTILLATION;
DOI
10.1109/TCSVT.2024.3374870
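Chinese Library Classification (CLC) number
TM [Electrical technology]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract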
Weakly-supervised temporal action localization (WTAL) is the problem of learning an action localization model when only video-level labels are available. In recent years, many WTAL methods have been developed. However, existing approaches often do not consider the hard-to-predict snippets near action boundaries, which leads to incomplete and over-complete action predictions. To solve these issues, this work proposes an end-to-end snippets relation and hard-snippets mask network (SRHN). Specifically, a hard-snippets mask module adaptively masks the hard-to-predict snippets, so that the trained model focuses more on snippets with low uncertainty. Then, a snippets relation module is designed to capture the relationships among snippets; by aggregating information from multiple temporal receptive fields, it makes hard-to-predict snippets easier to predict. Finally, a snippet enhancement loss is developed that, for both hard-to-predict snippets and other snippets, reduces the predicted probabilities of actions that are not present in the video and increases the probabilities of actions that are present. Extensive experiments on the THUMOS14, ActivityNet1.2, and ActivityNet1.3 datasets demonstrate the effectiveness of the SRHN method.
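The abstract does not specify how uncertainty is measured or how temporal receptive fields are combined, so the following is only an illustrative sketch and not the authors' implementation. It assumes that uncertainty is the entropy of a snippet's class distribution, that a snippet counts as "hard" when its entropy exceeds the video's mean entropy, and that plain mean pooling over a few window sizes stands in for the learned relation module. The function names, the threshold rule, and the `windows` values are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one snippet's class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy, used here as a stand-in uncertainty measure."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def hard_snippet_mask(class_scores):
    """Return 1 for snippets to keep and 0 for snippets to mask.

    Assumption: a snippet is "hard" when its entropy exceeds the
    video's mean entropy. The paper's actual criterion may differ.
    """
    ents = [entropy(softmax(s)) for s in class_scores]
    threshold = sum(ents) / len(ents)
    return [0 if e > threshold else 1 for e in ents]


def multi_scale_context(features, windows=(1, 3, 5)):
    """Average each snippet's scalar feature over several temporal windows.

    Assumption: mean pooling replaces the learned relation module, and
    `windows` are hypothetical receptive-field sizes in snippets.
    """
    n = len(features)
    out = []
    for t in range(n):
        pooled = []
        for w in windows:
            lo, hi = max(0, t - w // 2), min(n, t + w // 2 + 1)
            pooled.append(sum(features[lo:hi]) / (hi - lo))
        out.append(sum(pooled) / len(pooled))
    return out


# Toy video: 4 snippets, 3 classes (hypothetical scores).
scores = [[5.0, 0.1, 0.1], [1.0, 0.9, 0.8], [0.1, 4.0, 0.2], [0.5, 0.5, 0.5]]
mask = hard_snippet_mask(scores)
context = multi_scale_context([0.0, 3.0, 6.0])
```

In this toy input, the two snippets with nearly uniform class scores are the ones that get masked, which mirrors the idea of letting training focus on low-uncertainty snippets.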
Pages: 7202 - 7215
Page count: 14
Related papers
50 in total
  • [31] A Hybrid Attention Mechanism for Weakly-Supervised Temporal Action Localization
    Islam, Ashraful
    Long, Chengjiang
    Radke, Richard
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 1637 - 1645
  • [32] Deep Motion Prior for Weakly-Supervised Temporal Action Localization
    Cao, Meng
    Zhang, Can
    Chen, Long
    Shou, Mike Zheng
    Zou, Yuexian
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 5203 - 5213
  • [33] Weakly-Supervised Temporal Action Localization with Regional Similarity Consistency
    Ren, Haoran
    Ren, Hao
    Lu, Hong
    Jin, Cheng
    MULTIMEDIA MODELING, MMM 2023, PT I, 2023, 13833 : 69 - 81
  • [34] Context Sensitive Network for weakly-supervised fine-grained temporal action localization
    Dong, Cerui
    Liu, Qinying
    Wang, Zilei
    Zhang, Yixin
    Zhao, Feng
    NEURAL NETWORKS, 2025, 185
  • [35] Adaptive Two-Stream Consensus Network for Weakly-Supervised Temporal Action Localization
    Zhai, Yuanhao
    Wang, Le
    Tang, Wei
    Zhang, Qilin
    Zheng, Nanning
    Doermann, David
    Yuan, Junsong
    Hua, Gang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (04) : 4136 - 4151
  • [36] Entropy guided attention network for weakly-supervised action localization
    Cheng, Yi
    Sun, Ying
    Fan, Hehe
    Zhuo, Tao
    Lim, Joo-Hwee
    Kankanhalli, Mohan
    PATTERN RECOGNITION, 2022, 129
  • [37] Semantic and Temporal Contextual Correlation Learning for Weakly-Supervised Temporal Action Localization
    Fu, Jie
    Gao, Junyu
    Xu, Changsheng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (10) : 12427 - 12443
  • [38] Learning Action Completeness from Points for Weakly-supervised Temporal Action Localization
    Lee, Pilhyeon
    Byun, Hyeran
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 13628 - 13637
  • [39] Action-Aware Network with Upper and Lower Limit Loss for Weakly-Supervised Temporal Action Localization
    Bi, Mingwen
    Li, Jiaqi
    Liu, Xinliang
    Zhang, Qingchuan
    Yang, Zhenghong
    NEURAL PROCESSING LETTERS, 2023, 55 (04) : 4307 - 4324