Locality-Sensitive Hashing for Information Retrieval System on Multiple GPGPU Devices

Cited by: 7
Authors
Toan Nguyen Mau [1]
Inoguchi, Yasushi [2]
Affiliations
[1] Japan Adv Inst Sci & Technol, Grad Sch Informat Sci, Inoguchi Lab, 1-1 Asahidai, Nomi, Ishikawa 9231211, Japan
[2] Japan Adv Inst Sci & Technol, Res Ctr Adv Comp Infrastruct, 1-1 Asahidai, Nomi, Ishikawa 9231211, Japan
Source
APPLIED SCIENCES-BASEL | 2020 / Vol. 10 / Issue 07
Keywords
locality-sensitive hashing; structured data set; GPGPU; similarity searching; parallel processing; distributed memory; NEAREST-NEIGHBOR
DOI
10.3390/app10072539
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Building a real-time information retrieval system is challenging, especially for high-dimensional big data. To structure such data, many hashing algorithms have been proposed that map similar data items to the same bucket to speed up search. Locality-Sensitive Hashing (LSH) is a common approach for reducing the number of dimensions of a data set by using a family of hash functions and a hash table. The LSH hash table is an additional component that indexes hash values (keys) to the corresponding data items. We previously proposed the Dynamic Locality-Sensitive Hashing (DLSH) algorithm, whose dynamically structured hash table is optimized for storage in both main memory and the memory of General-Purpose computing on Graphics Processing Units (GPGPU) devices. This supports constantly updated data sets, such as song, image, or text databases. DLSH works effectively with data sets that are updated frequently and is compatible with parallel processing. However, a single GPGPU device is inadequate for processing big data because of its small memory capacity, and searching across multiple GPGPU devices requires an effective search algorithm to balance the jobs. In this paper, we propose an extension of DLSH for big data sets using multiple GPGPUs, in order to increase the capacity and performance of the information retrieval system. We also propose different search strategies on multiple DLSH clusters to suit our parallelized system. With significant results in terms of performance and accuracy, we show that DLSH can be applied to real-life dynamic database systems.
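The bucket-based lookup described in the abstract can be illustrated with a minimal sketch. It is not the authors' DLSH implementation: it assumes random-hyperplane (sign) hashing as the LSH family, uses plain Python dictionaries as stand-ins for per-device hash tables, and distributes items round-robin as one possible way to balance shards. All names (`make_hyperplanes`, `lsh_key`, `ShardedLSHIndex`) are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (an assumed LSH family, not DLSH's)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_key(vector, hyperplanes):
    """Hash a vector to a bit tuple: one sign bit per hyperplane."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


class ShardedLSHIndex:
    """Hash tables split across shards; each shard stands in for one device."""

    def __init__(self, num_shards, num_bits, dim, seed=0):
        self.hyperplanes = make_hyperplanes(num_bits, dim, seed)
        self.shards = [defaultdict(list) for _ in range(num_shards)]
        self.count = 0

    def insert(self, item_id, vector):
        # Round-robin placement keeps shard sizes balanced (an assumption).
        shard = self.shards[self.count % len(self.shards)]
        shard[lsh_key(vector, self.hyperplanes)].append((item_id, vector))
        self.count += 1

    def query(self, vector, top_k=1):
        # Probe the same bucket in every shard, then rank the candidates
        # by exact distance to the query.
        key = lsh_key(vector, self.hyperplanes)
        candidates = []
        for shard in self.shards:
            candidates.extend(shard.get(key, []))
        candidates.sort(key=lambda pair: squared_distance(pair[1], vector))
        return [item_id for item_id, _ in candidates[:top_k]]


if __name__ == "__main__":
    index = ShardedLSHIndex(num_shards=2, num_bits=4, dim=3, seed=1)
    index.insert("a", [1.0, 0.0, 0.0])
    index.insert("b", [-1.0, 0.0, 0.0])
    print(index.query([1.0, 0.0, 0.0]))
```

In this sketch every shard is probed for each query and the partial results are merged afterwards; the search strategies evaluated in the paper may differ.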
Pages: 18
Related papers
50 in total
  • [21] Cross-media retrieval based on locality-sensitive hashing and neural network algorithms
    Bai L.
    Jia Y.
    Wang H.
    Xie Y.
    Yu T.
    2018, National University of Defense Technology (40): 93 - 98
  • [22] Locality-Sensitive Hashing Optimizations for Fast Malware Clustering
    Oprisa, Ciprian
    Checiches, Marius
    Nandrean, Adrian
    2014 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTER COMMUNICATION AND PROCESSING (ICCP), 2014, : 97 - +
  • [23] Fast Redescription Mining Using Locality-Sensitive Hashing
    Karjalainen, Maiju
    Galbrun, Esther
    Miettinen, Pauli
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: RESEARCH TRACK, PT VII, ECML PKDD 2024, 2024, 14947 : 124 - 142
  • [24] Locality-Sensitive Hashing for Chi2 Distance
    Gorisse, David
    Cord, Matthieu
    Precioso, Frederic
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2012, 34 (02) : 402 - 409
  • [25] Learnable Locality-Sensitive Hashing for Video Anomaly Detection
    Lu, Yue
    Cao, Congqi
    Zhang, Yifan
    Zhang, Yanning
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (02) : 963 - 976
  • [26] Locality-Sensitive Hashing Without False Negatives for lp
    Pacuk, Andrzej
    Sankowski, Piotr
    Wegrzycki, Karol
    Wygocki, Piotr
    COMPUTING AND COMBINATORICS, COCOON 2016, 2016, 9797 : 105 - 118
  • [27] Kernelized Locality-Sensitive Hashing for Scalable Image Search
    Kulis, Brian
    Grauman, Kristen
    2009 IEEE 12TH INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2009, : 2130 - 2137
  • [29] A Locality-Sensitive Hashing-Based Jamming Detection System for IoT Networks
    Ganeshkumar, P.
    Albalawi, Talal
    CMC-COMPUTERS MATERIALS & CONTINUA, 2022, 73 (03) : 5943 - 5959
  • [30] Big Data Retrieval Using Locality-Sensitive Hashing with Document-Based NoSQL Database
    Gayathiri, N. R.
    Natarajan, A. M.
    IETE JOURNAL OF RESEARCH, 2021, 67 (06) : 969 - 978