Non-Local Temporal Difference Network for Temporal Action Detection

Cited by: 3
Authors
He, Yilong [1 ,2 ]
Han, Xiao [1 ,2 ]
Zhong, Yong [1 ,2 ]
Wang, Lishun [1 ,2 ]
Affiliations
[1] Chinese Acad Sci, Chengdu Inst Comp Applicat, Chengdu 610081, Peoples R China
[2] Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 100049, Peoples R China
Keywords
temporal action detection; deep learning; convolutional neural networks; computer vision; video understanding;
DOI
10.3390/s22218396
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry];
Discipline classification codes
070302; 081704;
Abstract
As an important part of video understanding, temporal action detection (TAD) has a wide range of applications. It aims to simultaneously predict the boundary positions and class label of every action instance in an untrimmed video. Most existing TAD methods stack convolutional blocks to model long temporal structures. However, most of the information in adjacent frames is redundant, and information from distant frames is weakened after multiple convolution operations. In addition, the durations of action instances vary widely, making it difficult for single-scale modeling to fit complex video structures. To address these issues, we propose a non-local temporal difference network (NTD), which comprises a chunk convolution (CC) module, a multiple temporal coordination (MTC) module, and a temporal difference (TD) module. The TD module adaptively enhances motion information and boundary features with temporal attention weights. The CC module evenly divides the input sequence into N chunks and uses multiple independent convolution blocks to simultaneously extract features from neighboring chunks. It thereby delivers information from distant frames while avoiding being trapped in purely local convolution. The MTC module adopts a cascaded residual architecture, which aggregates multiscale temporal features without introducing additional parameters. NTD achieves state-of-the-art performance on two large-scale datasets: 36.2% mAP@avg on ActivityNet-v1.3 and 71.6% mAP@0.5 on THUMOS-14.
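The abstract describes two mechanisms that a small example may make more concrete: re-weighting time steps by how much the features change between frames (the idea behind the TD module), and evenly dividing a sequence into N chunks (the first step of the CC module). The following is only a minimal pure-Python sketch under stated assumptions, not the authors' implementation: the L1 difference, the softmax weighting, the residual form `1 + w`, and all function names are hypothetical choices made for illustration.

```python
import math


def temporal_difference(features):
    """L1 distance between each frame and the previous one (assumed metric).

    The first frame has no predecessor, so its difference is set to 0.
    """
    diffs = [0.0]
    for prev, cur in zip(features, features[1:]):
        diffs.append(sum(abs(c - p) for p, c in zip(prev, cur)))
    return diffs


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(features):
    """Temporal attention weights derived from difference magnitudes."""
    return softmax(temporal_difference(features))


def enhance(features):
    """Scale each frame by (1 + weight): a hypothetical residual re-weighting."""
    weights = attention_weights(features)
    return [[x * (1.0 + w) for x in frame]
            for frame, w in zip(features, weights)]


def split_into_chunks(sequence, n_chunks):
    """Evenly divide a sequence into n_chunks contiguous chunks."""
    if n_chunks <= 0 or len(sequence) % n_chunks != 0:
        raise ValueError("sequence length must be divisible by n_chunks")
    size = len(sequence) // n_chunks
    return [sequence[i * size:(i + 1) * size] for i in range(n_chunks)]


# Toy sequence: 4 time steps, 2 feature channels, one abrupt change at step 2.
features = [[0.0, 0.0], [0.0, 0.0], [1.0, 2.0], [1.0, 2.0]]
diffs = temporal_difference(features)
weights = attention_weights(features)
enhanced = enhance(features)
chunks = split_into_chunks(features, 2)
```

In this toy sequence the only non-zero difference occurs at step 2, so that step receives the largest attention weight, which mirrors the abstract's point that motion and boundary positions should be emphasized over redundant neighboring frames. In the paper each chunk is then processed by its own convolution block; that step is omitted here.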
Pages: 15
Related papers
50 in total
  • [1] Non-local temporal interference
    Ali Ayatollah Rafsanjani
    MohammadJavad Kazemi
    Vahid Hosseinzadeh
    Mehdi Golshani
    [J]. Scientific Reports, 14
  • [2] Non-local temporal interference
    Rafsanjani, Ali Ayatollah
    Kazemi, MohammadJavad
    Hosseinzadeh, Vahid
    Golshani, Mehdi
    [J]. SCIENTIFIC REPORTS, 2024, 14 (01)
  • [3] NON-LOCAL STORAGE OF TEMPORAL INFORMATION
    LONGUETHIGGINS, HC
    [J]. PROCEEDINGS OF THE ROYAL SOCIETY SERIES B-BIOLOGICAL SCIENCES, 1968, 171 (1024): : 327 - +
  • [4] On the wave equation with a temporal non-local term
    Medjden, Mohamed
    Tatar, Nasser-Eddinne
    [J]. DYNAMIC SYSTEMS AND APPLICATIONS, 2007, 16 (04): : 665 - 671
  • [5] Global Temporal Difference Network for Action Recognition
    Xie, Zhao
    Chen, Jiansong
    Wu, Kewei
    Guo, Dan
    Hong, Richang
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 7594 - 7606
  • [6] GLFormer: Global and Local Context Aggregation Network for Temporal Action Detection
    He, Yilong
    Zhong, Yong
    Wang, Lishun
    Dang, Jiachen
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (17):
  • [7] Motif-GCNs With Local and Non-Local Temporal Blocks for Skeleton-Based Action Recognition
    Wen, Yu-Hui
    Gao, Lin
    Fu, Hongbo
    Zhang, Fang-Lue
    Xia, Shihong
    Liu, Yong-Jin
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, 45 (02) : 2009 - 2023
  • [8] A Non-local Mean Temporal Filter for Video Compression
    Chen, Cheng
    Han, Jingning
    Xu, Yaowu
    [J]. 2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020, : 1142 - 1146
  • [9] Spatio-Temporal Adaptive Network With Bidirectional Temporal Difference for Action Recognition
    Li, Zhilei
    Li, Jun
    Ma, Yuqing
    Wang, Rui
    Shi, Zhiping
    Ding, Yifu
    Liu, Xianglong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 5174 - 5185
  • [10] Depthwise Temporal Non-Local Network for Faster and Better Dynamic Hand Gesture Authentication
    Song, Wenwei
    Kang, Wenxiong
    [J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2023, 18 : 1870 - 1883