Learn from Unlabeled Videos for Near-duplicate Video Retrieval

Cited by: 4
Authors
He, Xiangteng [1]
Pan, Yulin [2]
Tang, Mingqian [2]
Lv, Yiliang [2]
Peng, Yuxin [1]
Affiliations
[1] Peking Univ, Wangxuan Inst Comp Technol, Beijing, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
Near-duplicate Video Retrieval; Video Representation Learning; Similarity Search; LOCALIZATION; CNN;
DOI
10.1145/3477495.3532010
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Near-duplicate video retrieval (NDVR) aims to find copies or transformations of a query video in a massive video database. It plays an important role in many video-related applications, such as copyright protection, tracing, and filtering. Video representation and similarity search are crucial to any video retrieval system. To derive effective video representations, most video retrieval systems require a large amount of manually annotated training data, which is costly and inefficient. In addition, most retrieval systems perform similarity search on frame-level features, which is expensive in both storage and search. To address these shortcomings, we propose a video representation learning (VRL) approach. It first learns video representations from unlabeled videos via contrastive learning, avoiding the expensive cost of manual annotation. Then, it uses a transformer structure to aggregate frame-level features into clip-level features, reducing both storage space and search complexity. The model learns complementary and discriminative information from the interactions among the frames of a clip, and it gains invariance to frame permutation and to missing frames, which supports more flexible retrieval modes. Comprehensive experiments on two challenging near-duplicate video retrieval datasets, FIVR-200K and SVD, verify the effectiveness of the proposed VRL approach, which achieves the best video retrieval performance in both accuracy and efficiency.
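The two ideas named in the abstract, aggregating frame-level features into one clip-level feature and training with a contrastive objective, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the architecture or the loss, so everything below is an assumption made for illustration. The hypothetical `aggregate_clip` uses a single self-attention step with identity projections (no learned weights, no positional encoding) followed by mean pooling, and the hypothetical `info_nce` uses an InfoNCE-style loss, one common choice for contrastive learning.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_clip(frames):
    """Toy clip-level aggregation (assumption, not the paper's model).

    Each frame attends to every frame of the clip (scaled dot-product
    attention with identity projections), the attended frames are
    mean-pooled, and the result is L2-normalized. Without positional
    encoding, the output does not depend on frame order.
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    attended = []
    for query in frames:
        weights = softmax([dot(query, key) / scale for key in frames])
        attended.append(
            [sum(w * v[d] for w, v in zip(weights, frames)) for d in range(dim)]
        )
    count = len(attended)
    clip = [sum(f[d] for f in attended) / count for d in range(dim)]
    norm = math.sqrt(dot(clip, clip)) or 1.0
    return [c / norm for c in clip]


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: low when the anchor is closer to the positive
    than to the negatives (the loss choice is an assumption)."""
    logits = [dot(anchor, positive) / temperature]
    logits += [dot(anchor, neg) / temperature for neg in negatives]
    return -math.log(softmax(logits)[0])


random.seed(0)
# Five hypothetical 8-dimensional frame features for one clip.
frames = [[random.gauss(0, 1) for _ in range(8)] for _ in range(5)]
clip = aggregate_clip(frames)
clip_reversed = aggregate_clip(frames[::-1])

# Largest coordinate difference between the two clip vectors.
max_diff = max(abs(a - b) for a, b in zip(clip, clip_reversed))
print("clip dims:", len(clip), "max diff after reordering frames:", max_diff)

# A second, unrelated clip serves as the negative example.
other = aggregate_clip([[random.gauss(0, 1) for _ in range(8)] for _ in range(5)])
print("loss with matching positive:", info_nce(clip, clip_reversed, [other]))
print("loss with mismatched positive:", info_nce(clip, other, [clip_reversed]))
```

In this sketch, five frame vectors collapse into a single clip vector, so an index would store and compare one vector per clip instead of one per frame, which is the storage and search saving the abstract describes. Reordering the frames leaves the clip vector unchanged up to floating-point rounding.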
Pages: 1002-1011
Number of pages: 10