DASH: Data Aware Locality Sensitive Hashing

Authors
Tan, Zongyuan [1, 3]
Wang, Hongya [1, 2, 3]
Du, Ming [1 ]
Zhang, Jie [1 ]
Affiliations
[1] Donghua Univ, Sch Comp Sci & Technol, Shanghai, Peoples R China
[2] Chinese Acad Sci, State Key Lab Comp Architecture, ICT, Beijing, Peoples R China
[3] Shanghai Key Lab Comp Software Evaluating & Testi, Shanghai, Peoples R China
Source
Keywords
LSH; ANNS; High dimensions; Data-dependent hashing; Product quantization
DOI
10.1007/978-3-031-25198-6
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Locality sensitive hashing (LSH) has been extensively employed to solve the problem of c-approximate nearest neighbor search (c-ANNS) in high-dimensional spaces. However, the search performance of LSH degrades as the number of data points increases. To address this drawback, we propose an efficient method called Data Aware Sensitive Hashing (DASH). DASH is a data-dependent hashing algorithm that takes the residual distance prior into account. It leverages this prior knowledge and provides a theoretical guarantee for the search results. Our experimental results on various datasets show that DASH achieves better search performance, with running-time speedups of roughly 4x to 40x over other state-of-the-art methods.
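For context, the kind of data-independent LSH baseline that the abstract contrasts with can be sketched as follows. This is not the DASH algorithm, whose construction the abstract does not describe; it is a minimal illustration of classic random-projection (p-stable) LSH for Euclidean distance. The class name `E2LSHIndex` and all parameter names and default values are assumptions made for this example.

```python
import math
import random
from collections import defaultdict


class E2LSHIndex:
    """Baseline p-stable LSH for Euclidean distance (hypothetical sketch, not DASH)."""

    def __init__(self, dim, num_tables=8, hashes_per_table=4, bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        self.points = []
        # Each table holds its own hash functions h(v) = floor((a.v + b) / w),
        # with a ~ N(0, I) and b ~ U[0, w), plus a bucket map from key to point ids.
        self.tables = []
        for _ in range(num_tables):
            funcs = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, bucket_width))
                for _ in range(hashes_per_table)
            ]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, v):
        # Concatenate the individual hash values into one compound bucket key.
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / self.w)
            for a, b in funcs
        )

    def add(self, v):
        idx = len(self.points)
        self.points.append(list(v))
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, v)].append(idx)
        return idx

    def query(self, q):
        """Return (index, distance) of the closest colliding point, or None."""
        candidates = set()
        for funcs, buckets in self.tables:
            candidates.update(buckets.get(self._key(funcs, q), ()))
        if not candidates:
            return None
        # Only the colliding candidates are checked with exact distances.
        return min(((i, math.dist(q, self.points[i])) for i in candidates),
                   key=lambda pair: pair[1])
```

Because the hash functions are fixed at construction time, querying with a point that was indexed always collides with it in every table and returns it at distance 0. The hash functions here ignore the data distribution entirely, which is the property that data-dependent methods such as the one described in the abstract aim to improve on.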
Pages: 85-100
Page count: 16