TransVCL: Attention-Enhanced Video Copy Localization Network with Flexible Supervision

Cited by: 0

Authors:

He, Sifeng [1]

He, Yue [1]

Lu, Minlong [1]

Jiang, Chen [1]

Yang, Xudong [1]

Qian, Feng [1]

Zhang, Xiaobo [1]

Yang, Lei [1]

Zhang, Jiandong [2]

Affiliations:

[1] Ant Grp, Wuhan, Peoples R China

[2] Copyright Protect Ctr China, Beijing, Peoples R China

Source:

THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1 | 2023

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Video copy localization aims to precisely localize all copied segments within a pair of untrimmed videos in video retrieval applications. Previous methods typically start from a frame-to-frame similarity matrix generated by cosine similarity between the frame-level features of the input video pair, and then detect and refine the boundaries of copied segments on the similarity matrix under temporal constraints. In this paper, we propose TransVCL, an attention-enhanced video copy localization network that is optimized directly from initial frame-level features and trained end-to-end with three main components: a customized Transformer for feature enhancement, a correlation and softmax layer for similarity matrix generation, and a temporal alignment module for copied segment localization. In contrast to previous methods that require a handcrafted similarity matrix, TransVCL incorporates long-range temporal information between the feature sequence pair using self- and cross-attention layers. With the joint design and optimization of the three components, the similarity matrix can be learned to present more discriminative copied patterns, leading to significant improvements over previous methods on segment-level labeled datasets (VCSL and VCDB). Besides state-of-the-art performance in the fully supervised setting, the attention architecture enables TransVCL to further exploit unlabeled or simply video-level labeled data. Additional experiments that supplement video-level labeled datasets, including SVD and FIVR, reveal the high flexibility of TransVCL from full supervision to semi-supervision (with or without video-level annotation). Code is publicly available at https://github.com/transvcl/TransVCL.
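To make the abstract's starting point concrete, the following is a minimal sketch of the frame-to-frame cosine similarity matrix that previous methods are described as using. It is not taken from the TransVCL repository: the function names (`l2_normalize`, `cosine_similarity_matrix`, `row_softmax`), the toy feature vectors, and the `temperature` value are all assumptions made for illustration. The `row_softmax` helper only shows one generic way a softmax can sharpen a correlation matrix; it should not be read as the paper's actual correlation and softmax layer.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors stay zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def cosine_similarity_matrix(feats_a, feats_b):
    """Return a len(feats_a) x len(feats_b) matrix of cosine similarities.

    feats_a and feats_b are lists of frame-level feature vectors,
    one vector per frame, for the two videos in the pair.
    """
    a = [l2_normalize(f) for f in feats_a]
    b = [l2_normalize(f) for f in feats_b]
    return [[sum(x * y for x, y in zip(fa, fb)) for fb in b] for fa in a]


def row_softmax(matrix, temperature=0.1):
    """Hypothetical helper: softmax over each row of a similarity matrix.

    The temperature is an illustrative assumption, not a value from the paper.
    """
    out = []
    for row in matrix:
        peak = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp((v - peak) / temperature) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


if __name__ == "__main__":
    # Toy 2-D "features": video A has 2 frames, video B has 3 frames.
    video_a = [[1.0, 0.0], [0.0, 1.0]]
    video_b = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
    sim = cosine_similarity_matrix(video_a, video_b)
    for row in sim:
        print([round(v, 3) for v in row])
    for row in row_softmax(sim):
        print([round(v, 3) for v in row])
```

In this sketch, a copied segment would show up as a run of high values along a roughly diagonal path in the matrix, which is the pattern that the boundary detection step in earlier methods searches for.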


Pages: 799-807

Page count: 9
