Learning spatial-temporal features for video copy detection by the combination of CNN and RNN

Cited by: 28
Authors
Hu, Yaocong [1, 2, 3]
Lu, Xiaobo [1, 2, 3]
Affiliations
[1] Southeast Univ, Coll Automat, Nanjing 210096, Jiangsu, Peoples R China
[2] Southeast Univ, Sch Automat, Nanjing 210096, Jiangsu, Peoples R China
[3] Southeast Univ, Minist Educ, Key Lab Measurement & Control Complex Syst Engn, Nanjing 210096, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Video copyright; CNN; Sequence matching; SiameseLSTM; CLASSIFICATION; WATERMARKING
DOI
10.1016/j.jvcir.2018.05.013
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the rapid development of networked multimedia, online video copyright protection has become a hot topic in recent research. However, video copy detection remains a challenging task in video analysis and computer vision because of the large variations in scale and illumination of the copied content. In this paper, we propose a novel deep learning based approach in which we jointly use a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) to solve the specific problem of detecting copied segments in videos. We first utilize a Residual Convolutional Neural Network (ResNet) to extract frame-level content features, and then employ a SiameseLSTM architecture for spatial-temporal fusion and sequence matching. Finally, the copied segments are detected by a graph-based temporal network. We evaluate the proposed CNN-RNN based approach on VCDB, a public large-scale video copy dataset, and the experimental results demonstrate the effectiveness and high robustness of our method, which achieves significant performance improvements over the state of the art.
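The abstract does not spell out how the graph-based temporal network operates, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes that an earlier stage has already produced frame-level matches as `(query_frame, reference_frame, similarity)` tuples, treats each match as a graph node, links two matches when both videos advance by a small positive number of frames, and returns the highest-scoring chain as the candidate copied segment. The function name `best_copied_segment` and the parameter `max_gap` are hypothetical.

```python
from typing import List, Tuple

# (query_frame, reference_frame, similarity); assumed output of a matching stage
Match = Tuple[int, int, float]


def best_copied_segment(
    matches: List[Match], max_gap: int = 3
) -> Tuple[float, List[Match]]:
    """Return (total_score, chain) for the best temporally consistent chain.

    Two matches are linked when both the query and reference frame indices
    increase by at least 1 and at most max_gap (an assumed consistency rule).
    """
    if not matches:
        return 0.0, []

    # Sorting by query frame guarantees predecessors come before successors.
    nodes = sorted(matches)
    best = [m[2] for m in nodes]  # best chain score ending at each node
    prev = [-1] * len(nodes)      # back-pointers for path recovery

    for j, (qj, rj, sj) in enumerate(nodes):
        for i in range(j):
            qi, ri, _ = nodes[i]
            if 0 < qj - qi <= max_gap and 0 < rj - ri <= max_gap:
                if best[i] + sj > best[j]:
                    best[j] = best[i] + sj
                    prev[j] = i

    # Pick the best end node, then walk the back-pointers to rebuild the chain.
    end = max(range(len(nodes)), key=best.__getitem__)
    total = best[end]
    chain: List[Match] = []
    while end != -1:
        chain.append(nodes[end])
        end = prev[end]
    chain.reverse()
    return total, chain


if __name__ == "__main__":
    demo = [(0, 10, 0.9), (1, 11, 0.8), (2, 12, 0.95), (1, 40, 0.7), (5, 3, 0.6)]
    score, chain = best_copied_segment(demo)
    print(score, chain)
```

In this toy input, the three matches that advance together in both videos form the detected segment, while the isolated matches are left out because they break temporal consistency.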
Pages: 21-29
Page count: 9
Related papers
50 items in total
  • [41] Emotion Classification Based on Transformer and CNN for EEG Spatial-Temporal Feature Learning
    Yao, Xiuzhen
    Li, Tianwen
    Ding, Peng
    Wang, Fan
    Zhao, Lei
    Gong, Anmin
    Nan, Wenya
    Fu, Yunfa
    BRAIN SCIENCES, 2024, 14 (03)
  • [42] STA-CNN: Convolutional Spatial-Temporal Attention Learning for Action Recognition
    Yang, Hao
    Yuan, Chunfeng
    Zhang, Li
    Sun, Yunda
    Hu, Weiming
    Maybank, Stephen J.
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 5783 - 5793
  • [43] A Spatial-Temporal-Scale Registration Approach for Video Copy Detection
    Chen, Shi
    Wang, Tao
    Wang, Jinqiao
    Li, Jianguo
    Zhang, Yimin
    Lu, Hanqing
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2008, 9TH PACIFIC RIM CONFERENCE ON MULTIMEDIA, 2008, 5353 : 407 - +
  • [44] Video Description with Spatial-Temporal Attention
    Tu, Yunbin
    Zhang, Xishan
    Liu, Bingtao
    Yan, Chenggang
    PROCEEDINGS OF THE 2017 ACM MULTIMEDIA CONFERENCE (MM'17), 2017, : 1014 - 1022
  • [45] Learning Image and Video Compression through Spatial-Temporal Energy Compaction
    Cheng, Zhengxue
    Sun, Heming
    Takeuchi, Masaru
    Katto, Jiro
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 10063 - 10072
  • [46] A Dual-Branch Spatial-Temporal Learning Network for Video Prediction
    Huang, Huilin
    Guan, Yepeng
    IEEE ACCESS, 2024, 12 : 73258 - 73267
  • [47] Learning a spatial-temporal symmetry network for video super-resolution
    Wang, Xiaohang
    Liu, Mingliang
    Wei, Pengying
    APPLIED INTELLIGENCE, 2023, 53 (03) : 3530 - 3544
  • [48] Learning a spatial-temporal symmetry network for video super-resolution
    Xiaohang Wang
    Mingliang Liu
    Pengying Wei
    Applied Intelligence, 2023, 53 : 3530 - 3544
  • [49] Spatial-Temporal Cascade Autoencoder for Video Anomaly Detection in Crowded Scenes
    Li, Nanjun
    Chang, Faliang
    Liu, Chunsheng
    IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 203 - 215
  • [50] Learning Efficient Spatial-Temporal Gait Features with Deep Learning for Human Identification
    Liu, Wu
    Zhang, Cheng
    Ma, Huadong
    Li, Shuangqun
    NEUROINFORMATICS, 2018, 16 (3-4) : 457 - 471