TransVCL: Attention-Enhanced Video Copy Localization Network with Flexible Supervision

Cited by: 0
Authors
He, Sifeng [1]
He, Yue [1]
Lu, Minlong [1]
Jiang, Chen [1]
Yang, Xudong [1]
Qian, Feng [1]
Zhang, Xiaobo [1]
Yang, Lei [1]
Zhang, Jiandong [2]
Affiliations
[1] Ant Group, Wuhan, People's Republic of China
[2] Copyright Protection Center of China, Beijing, People's Republic of China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Video copy localization aims to precisely localize all the copied segments within a pair of untrimmed videos in video retrieval applications. Previous methods typically start from a frame-to-frame similarity matrix, generated by cosine similarity between the frame-level features of the input video pair, and then detect and refine the boundaries of copied segments on that matrix under temporal constraints. In this paper, we propose TransVCL, an attention-enhanced video copy localization network that is optimized directly from the initial frame-level features and trained end-to-end. It has three main components: a customized Transformer for feature enhancement, a correlation and softmax layer for similarity matrix generation, and a temporal alignment module for copied segment localization. In contrast to previous methods that require a handcrafted similarity matrix, TransVCL incorporates long-range temporal information between the feature sequence pair using self- and cross-attention layers. With the joint design and optimization of the three components, the similarity matrix can be learned to present more discriminative copied patterns, leading to significant improvements over previous methods on segment-level labeled datasets (VCSL and VCDB). Besides state-of-the-art performance in the fully supervised setting, the attention architecture enables TransVCL to further exploit unlabeled or only video-level labeled data. Additional experiments that supplement training with video-level labeled datasets, including SVD and FIVR, demonstrate the high flexibility of TransVCL from full supervision to semi-supervision (with or without video-level annotation). Code is publicly available at https://github.com/transvcl/TransVCL.
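The frame-to-frame similarity matrix that the abstract attributes to previous methods can be illustrated with a small sketch. The code below is not taken from the TransVCL repository: the function names, the toy two-dimensional features, and the `temperature` value are all assumptions made for illustration. It computes cosine similarity between every frame pair, then applies a row-wise softmax as a rough stand-in for the kind of normalization a "correlation and softmax layer" might perform.

```python
import math


def cosine_similarity_matrix(feats_a, feats_b):
    """Return the len(feats_a) x len(feats_b) matrix of cosine similarities.

    Each input is a list of frame-level feature vectors (lists of floats).
    """
    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        # Leave all-zero vectors unchanged to avoid division by zero.
        return [x / norm for x in vec] if norm > 0 else list(vec)

    a = [normalize(v) for v in feats_a]
    b = [normalize(v) for v in feats_b]
    return [[sum(x * y for x, y in zip(u, w)) for w in b] for u in a]


def row_softmax(matrix, temperature=0.1):
    """Apply a softmax over each row (temperature is a hypothetical choice)."""
    result = []
    for row in matrix:
        peak = max(row)  # subtract the row maximum for numerical stability
        exps = [math.exp((x - peak) / temperature) for x in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


# Toy frame-level features for two short "videos" (illustrative values only).
video_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
video_b = [[2.0, 0.0], [0.0, 3.0]]

sim = cosine_similarity_matrix(video_a, video_b)
probs = row_softmax(sim)
```

In this toy case, frame 0 of `video_a` matches frame 0 of `video_b` and frame 1 matches frame 1, so a copied segment would appear as a diagonal streak of high values in `sim`. According to the abstract, TransVCL learns the features feeding this matrix end-to-end rather than fixing them in advance.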
Pages: 799-807
Page count: 9