Learn from Unlabeled Videos for Near-duplicate Video Retrieval

Cited by: 4

Authors:

He, Xiangteng [1]

Pan, Yulin [2]

Tang, Mingqian [2]

Lv, Yiliang [2]

Peng, Yuxin [1]

Affiliations:

[1] Peking Univ, Wangxuan Inst Comp Technol, Beijing, Peoples R China

[2] Alibaba Grp, Hangzhou, Peoples R China

Source:

PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22) | 2022

Funding:

National Key Research and Development Program of China; National Natural Science Foundation of China;

Keywords:

Near-duplicate Video Retrieval; Video Representation Learning; Similarity Search; LOCALIZATION; CNN;

DOI:

10.1145/3477495.3532010

Chinese Library Classification:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Near-duplicate video retrieval (NDVR) aims to find copies or transformations of a query video in a massive video database. It plays an important role in many video-related applications, including copyright protection, tracing, and filtering. Video representation and similarity search are crucial to any video retrieval system. To derive effective video representations, most video retrieval systems require a large amount of manually annotated training data, which is costly and inefficient. In addition, most retrieval systems rely on frame-level features for video similarity search, which is expensive in both storage and search. To address these issues, we propose a video representation learning (VRL) approach. It first learns video representations from unlabeled videos via contrastive learning, avoiding the expensive cost of manual annotation. Then, it exploits a transformer structure to aggregate frame-level features into clip-level features, reducing both storage space and search complexity. It can learn complementary and discriminative information from the interactions among clip frames, and it acquires invariance to frame permutation and missing frames, which supports more flexible retrieval. Comprehensive experiments on two challenging near-duplicate video retrieval datasets, FIVR-200K and SVD, verify the effectiveness of the proposed VRL approach, which achieves the best performance in both retrieval accuracy and efficiency.
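The two ideas in the abstract, aggregating frame-level features into one clip-level vector and training with a contrastive objective on unlabeled clips, can be illustrated with a minimal sketch. This is not the authors' implementation: mean pooling is swapped in for the transformer aggregator purely because it is the simplest permutation-invariant aggregator, the loss is a generic InfoNCE form, and every name and parameter here (`aggregate_clip`, `info_nce`, `temperature`, the feature sizes) is a hypothetical choice made for illustration.

```python
import math
import random


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def aggregate_clip(frames):
    """Collapse frame-level features into one clip-level vector.

    ASSUMPTION: mean pooling stands in for the transformer aggregator.
    It is invariant to frame order, but it does not model interactions
    among frames the way a transformer would.
    """
    dim = len(frames[0])
    mean = [sum(f[d] for f in frames) / len(frames) for d in range(dim)]
    return l2_normalize(mean)


def info_nce(view_a, view_b, temperature=0.1):
    """Generic InfoNCE loss over two lists of clip embeddings.

    view_a[i] and view_b[i] form the positive pair; every other
    combination in the batch is treated as a negative.
    """
    total = 0.0
    for i, a in enumerate(view_a):
        logits = [sum(x * y for x, y in zip(a, b)) / temperature for b in view_b]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum - logits[i]
    return total / len(view_a)


# Hypothetical data: 4 unlabeled clips, 6 frames each, 8-dim frame features.
rng = random.Random(0)
clips = [[[rng.gauss(0, 1) for _ in range(8)] for _ in range(6)] for _ in range(4)]

# Second "view" of each clip: the same frames in shuffled order.
shuffled = [rng.sample(clip, len(clip)) for clip in clips]

view_a = [aggregate_clip(c) for c in clips]
view_b = [aggregate_clip(c) for c in shuffled]
loss = info_nce(view_a, view_b)
```

Under these assumptions, storing one vector per clip instead of one per frame cuts the index size by the number of frames per clip (a factor of 6 in this toy setup), which is the storage and search saving the abstract refers to.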


Pages: 1002-1011

Page count: 10

