Building a web-scale image similarity search system

Cited by: 10
Authors
Batko, Michal [1]
Falchi, Fabrizio [2]
Lucchese, Claudio [2]
Novak, David [1]
Perego, Raffaele [2]
Rabitti, Fausto [2]
Sedmidubsky, Jan [1]
Zezula, Pavel [1]
Affiliations
[1] Masaryk Univ, Fac Informat, Brno, Czech Republic
[2] CNR, ISTI, I-56100 Pisa, Italy
Keywords
Similarity search; Content-based image retrieval; Metric space; MPEG-7 descriptors; Peer-to-peer search network; IMPLEMENTATION
DOI
10.1007/s11042-009-0339-z
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
As the number of digital images is growing fast and Content-based Image Retrieval (CBIR) is gaining in popularity, CBIR systems should leap towards Web-scale datasets. In this paper, we report on our experience in building an experimental similarity search system on a test collection of more than 50 million images. The first big challenge we have been facing was obtaining a collection of images of this scale with the corresponding descriptive features. We have tackled the non-trivial process of image crawling and extraction of several MPEG-7 descriptors. The result of this effort is a test collection, the first of such scale, opened to the research community for experiments and comparisons. The second challenge was to develop indexing and searching mechanisms able to scale to the target size and to answer similarity queries in real-time. We have achieved this goal by creating sophisticated centralized and distributed structures based purely on the metric space model of data. We have joined them together which has resulted in an extremely flexible and scalable solution. In this paper, we study in detail the performance of this technology and its evolvement as the data volume grows by three orders of magnitude. The results of the experiments are very encouraging and promising for future applications.
Pages: 599 - 629
Number of pages: 31
Related papers
50 in total
  • [1] Building a web-scale image similarity search system
    Michal Batko
    Fabrizio Falchi
    Claudio Lucchese
    David Novak
    Raffaele Perego
    Fausto Rabitti
    Jan Sedmidubsky
    Pavel Zezula
    Multimedia Tools and Applications, 2010, 47 : 599 - 629
  • [2] Web-scale system for image similarity search: When the dreams are coming true
    Novak, David
    Batko, Michal
    Zezula, Pavel
    2008 INTERNATIONAL WORKSHOP ON CONTENT-BASED MULTIMEDIA INDEXING, 2008: 430 - 437
  • [3] Building web-scale data mining infrastructure for search
    Ma, Wei-Ying
    PROGRESS IN WWW RESEARCH AND DEVELOPMENT, PROCEEDINGS, 2008, 4976 : 9 - 9
  • [4] Web-Scale Image Annotation
    Liu, Jiakai
    Hu, Rong
    Wang, Meihong
    Wang, Yi
    Chang, Edward Y.
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2008, 9TH PACIFIC RIM CONFERENCE ON MULTIMEDIA, 2008, 5353 : 663 - 674
  • [5] Web-scale image clustering revisited
    Avrithis, Yannis
    Kalantidis, Yannis
    Anagnostopoulos, Evangelos
    Emiris, Ioannis Z.
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015: 1502 - 1510
  • [6] Evolution of a Web-Scale Near Duplicate Image Detection System
    Gusev, Andrey
    Xu, Jiajing
    WEB CONFERENCE 2020: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2020), 2020: 2733 - 2739
  • [7] Duplicate-Search-Based Image Annotation Using Web-Scale Data
    Wang, Xin-Jing
    Zhang, Lei
    Ma, Wei-Ying
    PROCEEDINGS OF THE IEEE, 2012, 100 (09) : 2705 - 2721
  • [8] Web-Scale Responsive Visual Search at Bing
    Hu, Houdong
    Wang, Yan
    Yang, Linjun
    Komlev, Pavel
    Huang, Li
    Chen, Xi
    Huang, Jiapei
    Wu, Ye
    Merchant, Meenaz
    Sacheti, Arun
    KDD'18: PROCEEDINGS OF THE 24TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2018: 359 - 367
  • [9] Web-scale Multimedia Search for Internet Video Content
    Jiang, Lu
    PROCEEDINGS OF THE 25TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW'16 COMPANION), 2016: 311 - 316
  • [10] Web-scale Multimedia Search for Internet Video Content
    Jiang, Lu
    PROCEEDINGS OF THE NINTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM'16), 2016: 701 - 701