Locality-Sensitive Hashing for Information Retrieval System on Multiple GPGPU Devices

Cited by: 7
Authors
Toan Nguyen Mau [1]
Inoguchi, Yasushi [2]
Affiliations
[1] Japan Adv Inst Sci & Technol, Grad Sch Informat Sci, Inoguchi Lab, 1-1 Asahidai, Nomi, Ishikawa 9231211, Japan
[2] Japan Adv Inst Sci & Technol, Res Ctr Adv Comp Infrastruct, 1-1 Asahidai, Nomi, Ishikawa 9231211, Japan
Source
APPLIED SCIENCES-BASEL | 2020 / Vol. 10 / Issue 07
Keywords
locality-sensitive hashing; structured data set; GPGPU; similarity searching; parallel processing; distributed memory; NEAREST-NEIGHBOR;
DOI
10.3390/app10072539
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
It is challenging to build a real-time information retrieval system, especially for systems with high-dimensional big data. To structure big data, many hashing algorithms that map similar data items to the same bucket to advance the search have been proposed. Locality-Sensitive Hashing (LSH) is a common approach for reducing the number of dimensions of a data set, by using a family of hash functions and a hash table. The LSH hash table is an additional component that supports the indexing of hash values (keys) for the corresponding data/items. We previously proposed the Dynamic Locality-Sensitive Hashing (DLSH) algorithm with a dynamically structured hash table, optimized for storage in the main memory and General-Purpose computation on Graphics Processing Units (GPGPU) memory. This supports the handling of constantly updated data sets, such as songs, images, or text databases. The DLSH algorithm works effectively with data sets that are updated with high frequency and is compatible with parallel processing. However, the use of a single GPGPU device for processing big data is inadequate, due to the small memory capacity of GPGPU devices. When using multiple GPGPU devices for searching, we need an effective search algorithm to balance the jobs. In this paper, we propose an extension of DLSH for big data sets using multiple GPGPUs, in order to increase the capacity and performance of the information retrieval system. Different search strategies on multiple DLSH clusters are also proposed to adapt our parallelized system. With significant results in terms of performance and accuracy, we show that DLSH can be applied to real-life dynamic database systems.
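The abstract's core ideas (a family of hash functions, a hash table keyed by hash values, incremental inserts, and a search spread over several partitions) can be illustrated with a minimal sketch. This is not the authors' DLSH algorithm and it involves no GPGPU code: it assumes a generic random-hyperplane LSH family, a plain Python dictionary as the hash table, and ordinary in-process objects standing in for devices. The names `HyperplaneLSH`, `cosine`, and `sharded_search` are hypothetical and chosen only for illustration.

```python
import math
import random


class HyperplaneLSH:
    """Toy random-hyperplane LSH index (illustrative only, not the paper's DLSH)."""

    def __init__(self, dim, num_bits=8, seed=0):
        rng = random.Random(seed)
        # Each hash function is a random hyperplane; the sign of the
        # dot product with a vector contributes one bit of the key.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(num_bits)]
        # Hash table: integer key -> list of (item_id, vector) in that bucket.
        self.table = {}

    def key(self, vec):
        bits = 0
        for plane in self.planes:
            dot = sum(p * x for p, x in zip(plane, vec))
            bits = (bits << 1) | (1 if dot >= 0.0 else 0)
        return bits

    def insert(self, item_id, vec):
        # Items can be added at any time; only one bucket is touched.
        self.table.setdefault(self.key(vec), []).append((item_id, vec))

    def query(self, vec):
        # Candidates are the items that share the query's bucket.
        return self.table.get(self.key(vec), [])


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sharded_search(shards, vec, top_k=1):
    """Query every shard (a stand-in for one device) and merge the candidates."""
    candidates = []
    for shard in shards:
        for item_id, stored in shard.query(vec):
            candidates.append((cosine(vec, stored), item_id))
    candidates.sort(reverse=True)
    return candidates[:top_k]


# Usage: two shards sharing the same hash family, filled round-robin.
shards = [HyperplaneLSH(dim=4, num_bits=6, seed=42) for _ in range(2)]
items = {
    "a": [1.0, 0.0, 0.5, 0.2],
    "b": [-1.0, 0.3, -0.5, 0.0],
    "c": [0.9, 0.1, 0.4, 0.2],
}
for i, (item_id, vec) in enumerate(items.items()):
    shards[i % len(shards)].insert(item_id, vec)

result = sharded_search(shards, items["a"], top_k=1)
print(result[0][1])
```

Giving every shard the same seed keeps keys consistent across partitions, so one key computation per query would suffice in a real system. Querying all shards and merging is only one possible strategy; the abstract mentions several search strategies without detailing them, so none of them is reproduced here.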
Pages: 18