Snippet-level Supervised Contrastive Learning-based Transformer for Temporal Action Detection

Cited by: 0

Authors:

Xu, Ronghai [1]

Liu, Changhong [1]

Chen, Yong [2]

Lei, Zhenchun [1]

Affiliations:

[1] Jiangxi Normal Univ, Sch Comp & Informat Engn, Nanchang, Jiangxi, Peoples R China

[2] Nanchang Inst Technol, Sch Business Adm, Nanchang, Jiangxi, Peoples R China

Source:

2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN | 2023

Funding:

National Natural Science Foundation of China;

Keywords:

temporal action detection; supervised contrastive learning; transformer; action proposal generation;

DOI:

10.1109/IJCNN54540.2023.10191802

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Anchor-free temporal action detection methods have recently achieved good results in handling flexible action boundaries and varying action durations. However, anchor-free methods predict action boundaries from local features, so they are sensitive to noise and prone to generating incomplete action proposals. Moreover, there are long-term temporal dependencies between actions, and temporal semantic consistency between action primitives within the same action class. We therefore propose a snippet-level supervised contrastive learning-based transformer (SSCL-T) model for temporal action detection, which learns both local and global semantic temporal relationships in actions. The model learns the local temporal dynamic features of actions through local temporal coding and uses a transformer to model the global semantic dependencies between long-term actions. In addition, we use action class information to learn high-level semantic features of actions through snippet-level supervised contrastive learning, which pulls the temporal dynamic features of same-class actions as close together as possible and pushes the features of different classes as far apart as possible, enabling accurate prediction of action boundaries. The model is evaluated on two benchmark datasets, ActivityNet-v1.3 and THUMOS14. The experimental results show that the proposed model improves significantly on both datasets: compared with the baseline method BMN, the average mAP increases by 2.91% and 8.4% on ActivityNet-v1.3 and THUMOS14, respectively.
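The abstract does not give the exact loss used in SSCL-T, so the following is only a minimal sketch of the general idea behind snippet-level supervised contrastive learning. It assumes the widely used supervised contrastive (SupCon) formulation, in which snippets that share an action class are treated as positives. The function names, the temperature value, and the toy features are hypothetical and are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def snippet_supcon_loss(features, labels, temperature=0.1):
    """Generic supervised contrastive loss over snippet features (an assumed form).

    features: list of per-snippet feature vectors (lists of floats)
    labels:   list of action class ids, one per snippet
    Returns the mean loss over anchors that have at least one same-class snippet.
    """
    z = [l2_normalize(f) for f in features]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no same-class snippet contributes nothing
        # Temperature-scaled similarity of the anchor to every other snippet.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all non-anchor snippets.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
        # Average negative log-probability assigned to the positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy snippets: the first two point one way, the last two another.
toy_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned_loss = snippet_supcon_loss(toy_features, [0, 0, 1, 1])
mixed_loss = snippet_supcon_loss(toy_features, [0, 1, 0, 1])
```

In this toy case the loss is lower when the class labels agree with the feature clusters (`aligned_loss`) than when they cut across them (`mixed_loss`), which is the behaviour the abstract describes: same-class features are drawn together and different-class features are pushed apart.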


Pages: 8
