A Snippets Relation and Hard-Snippets Mask Network for Weakly-Supervised Temporal Action Localization

Cited by: 0
|
Authors
Zhao, Yibo [1 ]
Zhang, Hua [1 ]
Gao, Zan [1 ,2 ]
Guan, Weili [3 ]
Wang, Meng [4 ]
Chen, Shengyong [1 ]
Affiliations
[1] Tianjin Univ Technol, Key Lab Comp Vis & Syst, Minist Educ, Tianjin 300384, Peoples R China
[2] Qilu Univ Technol, Shandong Acad Sci, Shandong Artificial Intelligence Inst, Jinan 250014, Peoples R China
[3] Monash Univ, Fac Informat Technol, Clayton Campus, Clayton, Vic 3800, Australia
[4] Hefei Univ Technol, Sch Comp Sci & Informat Engn, Hefei 230009, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Location awareness; Task analysis; Proposals; Circuits and systems; Uncertainty; Prototypes; Multitasking; Weakly-supervised temporal action localization; snippets relation module; hard-snippets mask module; snippet enhancement loss; DISTILLATION;
DOI
10.1109/TCSVT.2024.3374870
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Classification Code
0808 ; 0809 ;
Abstract
Weakly-supervised temporal action localization (WTAL) is the problem of learning an action localization model with only video-level labels available. In recent years, many WTAL methods have been developed. However, hard-to-predict snippets near action boundaries are often not considered in these existing approaches, causing action incompleteness and over-completeness issues. To solve these issues, in this work, an end-to-end snippets relation and hard-snippets mask network (SRHN) is proposed. Specifically, a hard-snippets mask module is applied to mask the hard-to-predict snippets adaptively; in this way, the trained model focuses more on snippets with low uncertainty. Then, a snippets relation module is designed to capture the relationships among snippets, making hard-to-predict snippets easier to predict by aggregating information over multiple temporal receptive fields. Finally, a snippet enhancement loss is further developed to suppress, for hard-to-predict and other snippets, the probabilities of action classes not present in the video, while enlarging the probabilities of the action classes that do appear. Extensive experiments on the THUMOS14, ActivityNet1.2, and ActivityNet1.3 datasets demonstrate the effectiveness of the SRHN method.
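The record does not give the paper's exact formulation, but the hard-snippets masking idea described above can be sketched as follows: given a class activation sequence (per-snippet class probabilities), score each snippet's uncertainty and mask out the most uncertain ones before aggregation. This is a minimal illustration only, assuming entropy as the uncertainty measure and a fixed masking ratio; both choices are this sketch's assumptions, not details from the paper.

```python
import numpy as np

def entropy(p, eps=1e-8):
    """Per-snippet uncertainty: entropy of the class distribution."""
    return -(p * np.log(p + eps)).sum(axis=-1)

def mask_hard_snippets(cas, ratio=0.4):
    """Mask the `ratio` fraction of snippets with the highest
    uncertainty, so aggregation focuses on confident snippets.

    cas: (T, C) array of per-snippet class probabilities.
    Returns the masked CAS and a boolean keep-mask of shape (T,).
    (Illustrative only; the paper's module is learned and adaptive.)
    """
    u = entropy(cas)                  # (T,) uncertainty per snippet
    k = int(round(len(u) * ratio))    # number of "hard" snippets
    hard = np.argsort(u)[::-1][:k]    # indices of most uncertain
    keep = np.ones(len(u), dtype=bool)
    keep[hard] = False
    return cas[keep], keep

# Toy example: 5 snippets, 3 action classes.
cas = np.array([
    [0.90, 0.05, 0.05],   # confident
    [0.34, 0.33, 0.33],   # near-uniform -> hard
    [0.80, 0.10, 0.10],
    [0.40, 0.35, 0.25],   # fairly uncertain -> hard
    [0.85, 0.10, 0.05],
])
masked, keep = mask_hard_snippets(cas, ratio=0.4)
```

Here the two near-uniform snippets are dropped, mimicking how down-weighting uncertain boundary snippets keeps noisy predictions from corrupting the video-level score.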
Pages: 7202 - 7215
Number of pages: 14
Related Papers
50 records in total
  • [21] AutoLoc: Weakly-Supervised Temporal Action Localization in Untrimmed Videos
    Shou, Zheng
    Gao, Hang
    Zhang, Lei
    Miyazawa, Kazuyuki
    Chang, Shih-Fu
    COMPUTER VISION - ECCV 2018, PT XVI, 2018, 11220 : 162 - 179
  • [22] Deep Motion Prior for Weakly-Supervised Temporal Action Localization
    Cao, Meng
    Zhang, Can
    Chen, Long
    Shou, Mike Zheng
    Zou, Yuexian
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 5203 - 5213
  • [23] Dynamic Graph Modeling for Weakly-Supervised Temporal Action Localization
    Shi, Haichao
    Zhang, Xiao-Yu
    Li, Changsheng
    Gong, Lixing
    Li, Yong
    Bao, Yongjun
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 3820 - 3828
  • [24] Adaptive Mutual Supervision for Weakly-Supervised Temporal Action Localization
    Ju, Chen
    Zhao, Peisen
    Chen, Siheng
    Zhang, Ya
    Zhang, Xiaoyun
    Wang, Yanfeng
    Tian, Qi
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 6688 - 6701
  • [25] Vectorized Evidential Learning for Weakly-Supervised Temporal Action Localization
    Gao, Junyu
    Chen, Mengyuan
    Xu, Changsheng
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (12) : 15949 - 15963
  • [27] Boosting Weakly-Supervised Temporal Action Localization with Text Information
    Li, Guozhang
    Cheng, De
    Ding, Xinpeng
    Wang, Nannan
    Wang, Xiaoyu
    Gao, Xinbo
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 10648 - 10657
  • [28] Weakly-Supervised Temporal Action Localization by Progressive Complementary Learning
    Du, Jia-Run
    Feng, Jia-Chang
    Lin, Kun-Yu
    Hong, Fa-Ting
    Qi, Zhongang
    Shan, Ying
    Hu, Jian-Fang
    Zheng, Wei-Shi
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35 (01) : 938 - 952
  • [29] Self-supervised temporal adaptive learning for weakly-supervised temporal action localization
    Sheng, Jinrong
    Yu, Jiaruo
    Li, Ziqiang
    Li, Ao
    Ge, Yongxin
    INFORMATION SCIENCES, 2025, 705
  • [30] Complementary adversarial mechanisms for weakly-supervised temporal action localization
    Wang, Chuanxu
    Wang, Jing
    Liu, Peng
    PATTERN RECOGNITION, 2023, 139