Efficient and Robust Detection of Duplicate Videos in a Large Database

Cited by: 23
Authors
Sarkar, Anindya [1 ]
Singh, Vishwakarma [2 ]
Ghosh, Pratim [1 ]
Manjunath, Bangalore S. [1 ]
Singh, Ambuj [2 ]
Affiliations
[1] Univ Calif Santa Barbara, Dept Elect & Comp Engn, Santa Barbara, CA 93106 USA
[2] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
Funding
U.S. National Science Foundation;
Keywords
Color layout descriptor (CLD); duplicate detection; nonmetric distance; vector quantization (VQ); video fingerprinting; histogram
DOI
10.1109/TCSVT.2010.2046056
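Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes
0808 ; 0809 ;
Abstract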
We present an efficient and accurate method for duplicate video detection in a large database using video fingerprints. We have empirically chosen the color layout descriptor, a compact and robust frame-based descriptor, to create fingerprints, which are further encoded by vector quantization (VQ). We propose a new nonmetric distance measure for the similarity between a query and a database video fingerprint, and show experimentally that it outperforms other distance measures for accurate duplicate detection. Existing indexing techniques cannot support efficient search over high-dimensional data with a nonmetric distance measure. Therefore, we develop novel search algorithms based on precomputed distances and new dataset pruning techniques, which yield practical retrieval times. We perform experiments with a database of 38 000 videos, totaling 1600 h of content. For individual queries with an average duration of 60 s (about 50% of the average database video length), the duplicate video is retrieved in 0.032 s on a 2.33 GHz Intel Xeon CPU, with a very high accuracy of 97.5%.
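The pipeline the abstract describes (frame descriptors, VQ encoding, a nonmetric distance, precomputed distances) can be illustrated with a minimal sketch. This record does not give the paper's actual distance formula, codebook, or pruning rules, so everything below is an assumption made for illustration: the 2-D vectors stand in for real color layout descriptors, `CODEBOOK` is hand-picked rather than trained, and `asymmetric_distance` is one simple asymmetric (hence nonmetric) measure, not necessarily the one the authors propose.

```python
import math

# Hypothetical 4-entry codebook; real CLD vectors are higher-dimensional.
CODEBOOK = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]


def encode(frames, codebook):
    """Vector-quantize each frame descriptor to its nearest codeword index."""
    return [
        min(range(len(codebook)), key=lambda k: math.dist(f, codebook[k]))
        for f in frames
    ]


def codeword_distance_table(codebook):
    """Precompute all codeword-to-codeword distances once, offline."""
    return [[math.dist(a, b) for b in codebook] for a in codebook]


def asymmetric_distance(query_codes, db_codes, table):
    """Sum over query frames of the distance to the closest database frame.

    Illustrative stand-in: d(query, db) != d(db, query) in general,
    so this is not a metric.
    """
    db_unique = set(db_codes)
    return sum(min(table[q][x] for x in db_unique) for q in query_codes)


def find_duplicate(query_codes, database, table):
    """Exhaustive scan; returns the name of the closest database video."""
    return min(
        database,
        key=lambda name: asymmetric_distance(query_codes, database[name], table),
    )


TABLE = codeword_distance_table(CODEBOOK)

# Toy database of two "videos", each a short list of frame descriptors.
DATABASE = {
    "a": encode([(0.5, 0.2), (9.6, 0.3), (0.1, 0.4)], CODEBOOK),
    "b": encode([(0.3, 9.8), (9.9, 10.2), (10.1, 9.7)], CODEBOOK),
}

# A slightly perturbed two-frame clip of video "b".
QUERY_CODES = encode([(0.2, 10.1), (9.7, 9.9)], CODEBOOK)

if __name__ == "__main__":
    print(find_duplicate(QUERY_CODES, DATABASE, TABLE))
```

Because every frame is reduced to a codeword index, the query-time work is table lookups only. The sketch scans the whole database; the pruning strategies mentioned in the abstract would avoid that scan and are not reproduced here.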
Pages: 870-885
Page count: 16
Related papers
50 in total
  • [1] Duplicate detection and record consolidation in large bibliographic databases: the COPAC database experience
    Cousins, SA
    JOURNAL OF INFORMATION SCIENCE, 1998, 24 (04) : 231 - 240
  • [2] DUPLICATE RECORD DETECTION FOR DATABASE CLEANSING
    Rehman, Mariam
    Esichaikul, Vatcharapon
    2009 SECOND INTERNATIONAL CONFERENCE ON MACHINE VISION, PROCEEDINGS (ICMV 2009), 2009 : 333 - 338
  • [3] VCDB: A Large-Scale Database for Partial Copy Detection in Videos
    Jiang, Yu-Gang
    Jiang, Yudong
    Wang, Jiajun
    COMPUTER VISION - ECCV 2014, PT IV, 2014, 8692 : 357 - 371
  • [4] On the Annotation of Web Videos by Efficient Near-Duplicate Search
    Zhao, Wan-Lei
    Wu, Xiao
    Ngo, Chong-Wah
    IEEE TRANSACTIONS ON MULTIMEDIA, 2010, 12 (05) : 448 - 461
  • [5] Efficient Large Scale Near-Duplicate Video Detection Base on Spark
    Lv, Jinna
    Wu, Bin
    Yang, Shuai
    Jia, Bingjing
    Qiu, Peigang
    2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2016, : 957 - 962
  • [6] PC-Filter: A robust filtering technique for duplicate record detection in large databases
    Zhang, J
    Ling, TW
    Bruckner, RM
    Liu, H
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2004, 3180 : 486 - 496
  • [7] Efficient approach for detecting approximately duplicate database records
    Qiu, Y.F.
    Tian, Z.P.
    Ji, W.Y.
    Zhou, A.Y.
    Jisuanji Xuebao/Chinese Journal of Computers, 2001, 24 (01) : 69 - 77
  • [8] Efficient and exact duplicate detection on cloud
    Rong, Chuitian
    Lu, Wei
    Du, Xiaoyong
    Zhang, Xiao
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2013, 25 (15) : 2187 - 2206
  • [9] Fast and robust duplicate image detection on the web
    Gadeski, Etienne
    Le Borgne, Herve
    Popescu, Adrian
    MULTIMEDIA TOOLS AND APPLICATIONS, 2017, 76 (09) : 11839 - 11858
  • [10] Fast and robust duplicate image detection on the web
    Etienne Gadeski
    Hervé Le Borgne
    Adrian Popescu
    Multimedia Tools and Applications, 2017, 76 : 11839 - 11858