Effective near-duplicate image detection using perceptual hashing and deep learning

Cited by: 0
Authors
Jakhar, Yash [1]
Borah, Malaya Dutta [1]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Silchar, India
Keywords
Near-duplicate images; Neural network; Generative Adversarial Network; Perceptual hashing; Siamese network; Vision Transformer
DOI
10.1016/j.ipm.2025.104086
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Near-duplicate image detection is a long-standing problem in computer vision. Previous approaches highlighted the need to explore image transformations more thoroughly in order to handle complex images effectively. We propose a method for finding near-duplicate images that integrates three techniques: perceptual hashing, a Siamese network, and a Vision Transformer. Perceptual hashing gives a quick way to filter out similar-looking pictures, while the Siamese network architecture paired with the Vision Transformer identifies more complex near-duplicate instances. The integrated approach learns a metric space from data that reflects both visual similarity and perceptual closeness among items in the dataset. The results demonstrate the effectiveness and robustness of the proposed method: it achieves an AUROC of 0.99 and a precision of 0.987 on the California-ND dataset, and an AUROC of 0.92 and a precision of 0.884 on the INRIA Holidays dataset, outperforming traditional methods by over 10% in both metrics. This represents a significant step forward in near-duplicate image detection research.
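The hashing stage described in the abstract can be illustrated with a minimal sketch. The abstract does not say which perceptual hash the authors use, so this example assumes a difference hash (dHash) over a plain 2-D grayscale array; the function names, the nearest-neighbour downsampling, and the `max_distance` threshold of 10 bits are all hypothetical choices for illustration, not the paper's implementation.

```python
from typing import List, Sequence, Tuple

# A grayscale image as rows of pixel intensities (assumed input format).
Gray = Sequence[Sequence[float]]


def resize_nearest(img: Gray, out_h: int, out_w: int) -> List[List[float]]:
    """Downsample by nearest-neighbour sampling (a stand-in for real resampling)."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def dhash(img: Gray, hash_size: int = 8) -> int:
    """Difference hash: one bit per pair of horizontally adjacent pixels."""
    small = resize_nearest(img, hash_size, hash_size + 1)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def candidate_pairs(images: List[Gray], max_distance: int = 10) -> List[Tuple[int, int]]:
    """Return index pairs whose hashes differ by at most max_distance bits.

    The threshold is an assumed value; in practice it would be tuned on data.
    """
    hashes = [dhash(im) for im in images]
    return [
        (i, j)
        for i in range(len(hashes))
        for j in range(i + 1, len(hashes))
        if hamming(hashes[i], hashes[j]) <= max_distance
    ]
```

In a pipeline of the kind the abstract outlines, pairs that pass this cheap filter would presumably be handed to the learned model (the Siamese network with a Vision Transformer backbone) for a finer similarity decision; that second stage is not sketched here.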
Pages: 12