Learning spatial-temporal features for video copy detection by the combination of CNN and RNN

Cited by: 28

Authors:

Hu, Yaocong [1,2,3]

Lu, Xiaobo [1,2,3]

Affiliations:

[1] Southeast Univ, Coll Automat, Nanjing 210096, Jiangsu, Peoples R China

[2] Southeast Univ, Sch Automat, Nanjing 210096, Jiangsu, Peoples R China

[3] Southeast Univ, Minist Educ, Key Lab Measurement & Control Complex Syst Engn, Nanjing 210096, Jiangsu, Peoples R China

Source:

JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION | 2018 / Vol. 55

Funding:

National Natural Science Foundation of China;

Keywords:

Video copyright; CNN; Sequence matching; SiameseLSTM; CLASSIFICATION; WATERMARKING;

DOI:

10.1016/j.jvcir.2018.05.013

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

With the rapid development of networked multimedia, online video copyright protection has become an active research topic. However, video copy detection remains a challenging task in video analysis and computer vision, because copied content varies widely in scale and illumination. In this paper, we propose a novel deep-learning-based approach that jointly uses a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) to detect copied segments in videos. We first use a Residual Convolutional Neural Network (ResNet) to extract frame-level content features, and then employ a SiameseLSTM architecture for spatial-temporal fusion and sequence matching. Finally, the copied segments are detected by a graph-based temporal network. We evaluate the proposed CNN-RNN approach on VCDB, a public large-scale video copy dataset, and the experimental results demonstrate the effectiveness and robustness of our method, which achieves significant performance improvements over the state of the art.
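
The abstract does not spell out how the graph-based temporal network works, so the following is only a minimal sketch of one common reading of that final step, under stated assumptions: frame-level features are plain vectors (standing in for the ResNet/SiameseLSTM outputs), frame pairs are compared by cosine similarity, and a copied segment is the highest-scoring path of matches that moves forward in time in both videos. The names `cosine`, `best_aligned_segment`, `threshold`, and `max_gap` are hypothetical and do not come from the paper.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_aligned_segment(query, reference, threshold=0.8, max_gap=2):
    """Return (score, path) for the best temporally consistent match path.

    query, reference: lists of per-frame feature vectors (assumed inputs).
    path: list of (query_index, reference_index) pairs, in temporal order.
    """
    # Nodes of the graph: frame pairs whose similarity passes the threshold.
    # The nested loops emit nodes sorted by (i, j), so every possible
    # predecessor of a node appears before it in the list.
    nodes = []
    for i, q in enumerate(query):
        for j, r in enumerate(reference):
            s = cosine(q, r)
            if s >= threshold:
                nodes.append((i, j, s))

    # Dynamic programming over the implicit DAG: an edge links two nodes
    # when both time indices advance by at least 1 and at most max_gap.
    best = []  # best[k] = (accumulated score, index of predecessor or None)
    for k, (i, j, s) in enumerate(nodes):
        score, prev = s, None
        for p in range(k):
            pi, pj, _ = nodes[p]
            if 0 < i - pi <= max_gap and 0 < j - pj <= max_gap:
                candidate = best[p][0] + s
                if candidate > score:
                    score, prev = candidate, p
        best.append((score, prev))

    if not best:
        return 0.0, []

    # Backtrack from the highest-scoring node to recover the path.
    k = max(range(len(best)), key=lambda n: best[n][0])
    total = best[k][0]
    path = []
    while k is not None:
        path.append(nodes[k][:2])
        k = best[k][1]
    path.reverse()
    return total, path


# Toy 2-D features: the three query frames reappear at reference frames 1..3.
query = [[1, 0], [0, 1], [1, 1]]
reference = [[0, 1], [1, 0], [0, 1], [1, 1], [1, 0]]
score, path = best_aligned_segment(query, reference, threshold=0.9)
print(path)
```

In this toy input the recovered path pairs each query frame with the reference frame one position later, which is the kind of contiguous alignment a copied segment would produce. A real system would replace the toy vectors with learned features and tune `threshold` and `max_gap` on validation data.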

Citation

Pages: 21-29

Number of pages: 9

