Snippet-level Supervised Contrastive Learning-based Transformer for Temporal Action Detection

Cited by: 0
Authors
Xu, Ronghai [1]
Liu, Changhong [1]
Chen, Yong [2]
Lei, Zhenchun [1]
Affiliations
[1] Jiangxi Normal Univ, Sch Comp & Informat Engn, Nanchang, Jiangxi, Peoples R China
[2] Nanchang Inst Technol, Sch Business Adm, Nanchang, Jiangxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
temporal action detection; supervised contrastive learning; transformer; action proposal generation
DOI
10.1109/IJCNN54540.2023.10191802
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Anchor-free temporal action detection methods have recently achieved strong results in handling the flexible boundaries and varying durations of actions. However, because anchor-free methods predict action boundaries from local features, they are sensitive to noise and prone to generating incomplete action proposals. Moreover, there are long-term temporal dependencies between actions, and temporal semantic consistency among action primitives within the same action class. We therefore propose a snippet-level supervised contrastive learning-based transformer (SSCL-T) model for temporal action detection, which learns both local and global temporal semantic relationships in actions. The model learns the local temporal dynamic features of actions through local temporal coding and uses a transformer to model the global semantic dependencies between long-term actions. In addition, we use action class information to learn high-level semantic features of actions through a snippet-level supervised contrastive learning scheme, which forces the temporal dynamic features of actions in the same class to be as close as possible and those of actions in different classes to be as far apart as possible, enabling more accurate prediction of action boundaries. The model is evaluated on two benchmark datasets, ActivityNet-v1.3 and THUMOS14. The experimental results show that the proposed model improves significantly on both datasets: compared with the baseline method BMN, the average mAP increases by 2.91% on ActivityNet-v1.3 and by 8.4% on THUMOS14.
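The pull-together, push-apart objective described in the abstract can be illustrated with a generic supervised contrastive loss. The sketch below is a minimal NumPy version of that general idea, not the SSCL-T loss itself: the function name `supervised_contrastive_loss`, the `temperature` value, and the treatment of each snippet as one feature vector with one class label are all assumptions made for illustration, since the abstract does not give the exact formulation.

```python
import numpy as np


def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss over snippet features.

    Hypothetical helper for illustration only (not taken from the paper).
    features: (N, D) array, one feature vector per snippet.
    labels:   (N,) array of action class ids, one per snippet.
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)
    n = labels.shape[0]
    if n < 2:
        return 0.0  # no pairs to contrast

    # L2-normalise so that dot products are cosine similarities.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    z = features / np.maximum(norms, 1e-12)
    logits = (z @ z.T) / temperature

    not_self = ~np.eye(n, dtype=bool)
    # Positives: other snippets that share the anchor's class label.
    positives = (labels[:, None] == labels[None, :]) & not_self

    # Log-softmax over all other snippets (the anchor itself is excluded).
    logits = np.where(not_self, logits, -np.inf)
    row_max = logits.max(axis=1, keepdims=True)
    log_denominator = row_max + np.log(
        np.exp(logits - row_max).sum(axis=1, keepdims=True)
    )
    log_prob = logits - log_denominator

    # Average log-probability of the positives, for anchors that have any.
    pos_count = positives.sum(axis=1)
    has_pos = pos_count > 0
    if not has_pos.any():
        return 0.0
    sum_pos = np.where(positives, log_prob, 0.0).sum(axis=1)
    loss_per_anchor = -sum_pos[has_pos] / pos_count[has_pos]
    return float(loss_per_anchor.mean())


# Toy usage: four 2-D snippet features forming two tight clusters.
snippets = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
loss_matching = supervised_contrastive_loss(snippets, np.array([0, 0, 1, 1]))
loss_mismatched = supervised_contrastive_loss(snippets, np.array([0, 1, 0, 1]))
```

In this toy example the loss is lower when the labels agree with the feature clusters than when they cut across them, which is the behaviour the contrastive objective rewards during training.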
Pages: 8
Related papers
50 in total
  • [31] A balanced supervised contrastive learning-based method for encrypted network traffic classification
    Ma, Yuxiang
    Li, Zhaodi
    Xue, Haoming
    Chang, Jike
    COMPUTERS & SECURITY, 2024, 145
  • [32] Supervised Contrastive Learning-Based Unsupervised Domain Adaptation for Hyperspectral Image Classification
    Li, Zhaokui
    Xu, Qiang
    Ma, Li
    Fang, Zhuoqun
    Wang, Yan
    He, Wenqiang
    Du, Qian
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [33] Weakly Supervised Region-Level Contrastive Learning for Efficient Object Detection
    Deng, Yuang
    Zhang, Yuhang
    Dai, Wenrui
    Zhang, Xiaopeng
    Xiong, Hongkai
    2022 IEEE INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2022
  • [34] Weakly-Supervised Temporal Action Localization by Inferring Salient Snippet-Feature
    Yun, Wulian
    Qi, Mengshi
    Wang, Chuanming
    Ma, Huadong
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 7, 2024: 6908-6916
  • [35] Contrastive Learning-Based Anomaly Detection for Actual Corporate Environments
    An, Gi-taek
    Park, Jung-min
    Lee, Kyung-soon
    SENSORS, 2023, 23 (10)
  • [36] Supervised Contrastive Learning for Voice Activity Detection
    Heo, Youngjun
    Lee, Sunggu
    ELECTRONICS, 2023, 12 (03)
  • [37] Supervised Machine Learning-Based Detection of Concrete Efflorescence
    Fan, Ching-Lung
    Chung, Yu-Jen
    SYMMETRY-BASEL, 2022, 14 (11)
  • [38] Actionness Inconsistency-Guided Contrastive Learning for Weakly-Supervised Temporal Action Localization
    Li, Zhilin
    Wang, Zilei
    Liu, Qinying
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 2, 2023: 1513-1521
  • [39] Supervised Contrastive Learning For Musical Onset Detection
    Bolt, James
    Fazekas, Gyorgy
    PROCEEDINGS OF THE 18TH INTERNATIONAL AUDIO MOSTLY CONFERENCE, AM 2023, 2023: 130-135
  • [40] TSCL: Timestamp Supervised Contrastive Learning for Action Segmentation
    Patsch, Constantin
    Wu, Yuankai
    Salihu, Driton
    Zakour, Marsil
    Steinbach, Eckehard
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (09): 7485-7492