TAN: a temporal-aware attention network with context-rich representation for boosting proposal generation

Authors
Jiao, Yanyan [1, 2]
Yang, Wenzhu [1, 2]
Xing, Wenjie [1, 2]
Zeng, Shuang [1, 2]
Geng, Lei [1, 2]
Affiliations
[1] Hebei Univ, Sch Cyber Secur & Comp, Baoding 071002, Peoples R China
[2] Hebei Univ, Machine Vis Engn Res Ctr, Baoding 071002, Peoples R China
Keywords
Temporal action proposal generation; Temporal action detection; Global-aware attention; Adaptive temporal interaction
DOI
10.1007/s40747-024-01343-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Temporal action proposal generation in untrimmed videos is challenging, and comprehensive context exploration is critical for generating accurate candidate action instances. This paper proposes a Temporal-aware Attention Network (TAN) that localizes context-rich proposals by enhancing the temporal representations of boundaries and proposals. First, we observe that locating action instances precisely requires long-distance temporal context. To this end, we propose a Global-Aware Attention (GAA) module for boundary-level interaction. Specifically, we introduce two novel gating mechanisms into the top-down interaction structure to incorporate multi-level semantics into video features effectively. Second, we design an efficient, task-specific Adaptive Temporal Interaction (ATI) module to learn proposal associations. TAN enhances proposal-level contextual representations over a wide range by using multi-scale interaction modules. Extensive experiments on ActivityNet-1.3 and THUMOS-14 demonstrate the effectiveness of the proposed method: TAN achieves 73.43% AR@1000 on THUMOS-14 and 69.01% AUC on ActivityNet-1.3. Moreover, TAN significantly improves temporal action detection performance when combined with existing action classification frameworks.
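The AR@N figure quoted in the abstract is a recall-style metric over temporal intervals. The sketch below illustrates the general idea only and is not the authors' evaluation code: it scores the top-N proposals of a single video against ground-truth segments using temporal IoU (tIoU) and averages recall over a list of tIoU thresholds. The function names, the (start, end) segment format, and the thresholds are assumptions made for illustration; benchmark protocols additionally average over all videos and fix their own threshold ranges.

```python
from typing import Sequence, Tuple

Segment = Tuple[float, float]  # hypothetical format: (start, end) in seconds


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection over union of two 1-D time intervals."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_n(gt: Sequence[Segment], proposals: Sequence[Segment],
                n: int, threshold: float) -> float:
    """Fraction of ground-truth segments matched by any of the top-n proposals.

    Assumes `proposals` is already sorted by descending confidence.
    """
    if not gt:
        return 0.0
    top = proposals[:n]
    hits = sum(
        1 for g in gt
        if any(temporal_iou(g, p) >= threshold for p in top)
    )
    return hits / len(gt)


def average_recall_at_n(gt: Sequence[Segment], proposals: Sequence[Segment],
                        n: int, thresholds: Sequence[float]) -> float:
    """Mean of recall_at_n over the given tIoU thresholds (single video)."""
    return sum(recall_at_n(gt, proposals, n, t) for t in thresholds) / len(thresholds)


# Toy data, invented for illustration only.
gt_segments = [(0.0, 10.0), (20.0, 30.0)]
ranked_proposals = [(0.0, 10.0), (21.0, 30.0), (50.0, 60.0)]
ar_top3 = average_recall_at_n(gt_segments, ranked_proposals, 3, [0.5, 0.95])
ar_top1 = average_recall_at_n(gt_segments, ranked_proposals, 1, [0.5, 0.95])
```

In this toy case the second proposal overlaps its ground-truth segment with tIoU 0.9, so it counts as a hit at threshold 0.5 but not at 0.95, which is why averaging over thresholds rewards tighter boundaries.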
Pages: 3691-3708
Number of pages: 18
Source
Complex & Intelligent Systems, 2024, 10: 3691-3708