Learn from Unlabeled Videos for Near-duplicate Video Retrieval

Cited by: 4
Authors
He, Xiangteng [1]
Pan, Yulin [2]
Tang, Mingqian [2]
Lv, Yiliang [2]
Peng, Yuxin [1]
Affiliations
[1] Peking Univ, Wangxuan Inst Comp Technol, Beijing, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Near-duplicate Video Retrieval; Video Representation Learning; Similarity Search; Localization; CNN
DOI
10.1145/3477495.3532010
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Near-duplicate video retrieval (NDVR) aims to find copies or transformations of a query video in a massive video database. It plays an important role in many video-related applications, including copyright protection, tracing, and filtering. Video representation and similarity search are crucial to any video retrieval system. To derive an effective video representation, most video retrieval systems require a large amount of manually annotated training data, which is costly and inefficient. In addition, most retrieval systems rely on frame-level features for video similarity search, which is expensive in both storage and search. To address these shortcomings, we propose a video representation learning (VRL) approach. It first learns video representations from unlabeled videos via contrastive learning, avoiding the cost of manual annotation. Then, it uses a transformer structure to aggregate frame-level features into clip-level features, reducing both storage space and search complexity. It can learn complementary and discriminative information from the interactions among clip frames, and it acquires invariance to frame permutation and missing frames, which supports more flexible retrieval. Comprehensive experiments on two challenging near-duplicate video retrieval datasets, FIVR-200K and SVD, verify the effectiveness of the proposed VRL approach, which achieves the best performance in both accuracy and efficiency.
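The two ideas named in the abstract, aggregating frame-level features into one clip-level vector and training with a contrastive objective, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: a single self-attention layer with random weights stands in for the transformer, the loss is a generic InfoNCE-style formulation, and every name here (`aggregate_clip`, `info_nce`, `wq`, `wk`, `wv`, `temperature`) is a hypothetical chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def aggregate_clip(frames, wq, wk, wv):
    """Collapse (n_frames, d) frame features into one unit-length clip vector.

    Assumption: one attention head, no positional encoding. Without
    positional information, attention followed by mean pooling does not
    depend on the order of the frames.
    """
    q, k, v = frames @ wq, frames @ wk, frames @ wv
    attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))  # frame-to-frame weights
    mixed = attn @ v                                # each frame attends to all frames
    clip = mixed.mean(axis=0)                       # one vector per clip
    return clip / np.linalg.norm(clip)


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style loss: row i of `anchors` should match row i of `positives`.

    Both inputs are (batch, d) and assumed L2-normalized; the other rows
    in the batch act as negatives.
    """
    logits = anchors @ positives.T / temperature
    m = logits.max(axis=1, keepdims=True)
    log_z = m + np.log(np.exp(logits - m).sum(axis=1, keepdims=True))
    log_probs = logits - log_z                      # row-wise log-softmax
    return -np.mean(np.diag(log_probs))


# Toy data: 8 clips, 12 frames each, 16-dimensional frame features.
rng = np.random.default_rng(0)
d = 16
wq, wk, wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
clips = rng.normal(size=(8, 12, d))
clip_vectors = np.stack([aggregate_clip(c, wq, wk, wv) for c in clips])
```

In this toy setup the index stores 8 clip vectors instead of 8 × 12 frame vectors, which is the storage and search saving the abstract refers to. Because the sketch omits positional encoding, shuffling the frames of a clip leaves its clip vector unchanged up to floating-point error; whether the paper obtains its permutation invariance the same way is not stated in the abstract.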
Pages: 1002-1011
Page count: 10