LayerLSH: Rebuilding Locality-Sensitive Hashing Indices by Exploring Density of Hash Values

Cited by: 1
Authors
Ding, Jiwen [1]
Liu, Zhuojin [1]
Zhang, Yanfeng [1]
Gong, Shufeng [1]
Yu, Ge [1]
Affiliations
[1] Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110819, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Costs; Hash functions; Search problems; Nearest neighbor methods; Indexing; Compounds; Licenses; LSH; nearest neighbors search; multi-layered structure; data skewness; FRAMEWORK; SEARCH
DOI
10.1109/ACCESS.2022.3182802
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Locality-sensitive hashing (LSH) has attracted extensive research efforts for approximate nearest neighbor (NN) search. However, most LSH-based index structures fail to take data distribution into account. They perform well in a uniform data distribution setting but exhibit unstable performance when the data are skewed. It is well known that most real-life data are skewed, which makes LSH suffer. In this paper, we observe that the skewness of hash values resulting from skewed data is a potential reason for performance degradation. To address this problem, we propose to rebuild LSH indices by exploring the density of hash values. The hash values in dense/sparse ranges are carefully reorganized using a multi-layered structure, so that more effort is put into indexing the dense hash values. We further discuss the benefit in distributed computing. Extensive experiments are conducted to show the effectiveness and efficiency of the reconstructed LSH indices.
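The multi-layered idea in the abstract can be illustrated with a toy sketch. This is not the authors' LayerLSH algorithm, whose details are not given in this record. The class name `LayeredIndex`, the parameters `width`, `max_bucket`, and `max_depth`, and the rule of halving the bucket width at each layer are all assumptions made for illustration. The sketch uses one Gaussian-projection hash per layer and re-hashes only the buckets that turn out to be dense.

```python
import random
from collections import defaultdict


def make_hash(dim, width, rng):
    """One Gaussian-projection LSH function: h(v) = floor((a.v + b) / width)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, width)

    def h(v):
        return int((sum(x * y for x, y in zip(a, v)) + b) // width)

    return h


class LayeredIndex:
    """Hypothetical layered index: dense buckets get a finer child layer."""

    def __init__(self, points, dim, width=4.0, max_bucket=8,
                 depth=0, max_depth=3, rng=None):
        self.rng = rng if rng is not None else random.Random(0)
        self.h = make_hash(dim, width, self.rng)
        self.buckets = defaultdict(list)
        for idx, p in points:
            self.buckets[self.h(p)].append((idx, p))
        # Assumption: a bucket is "dense" when it holds more than max_bucket
        # points; such buckets are re-indexed with half the bucket width.
        self.children = {}
        if depth < max_depth:
            for key, members in self.buckets.items():
                if len(members) > max_bucket:
                    self.children[key] = LayeredIndex(
                        members, dim, width / 2, max_bucket,
                        depth + 1, max_depth, self.rng)

    def candidates(self, q):
        """Return ids of points sharing the deepest non-empty bucket with q."""
        key = self.h(q)
        child = self.children.get(key)
        if child is not None:
            found = child.candidates(q)
            if found:
                return found
        return [idx for idx, _ in self.buckets.get(key, [])]


# Skewed toy data: a tight cluster plus a few widely scattered points.
rng = random.Random(42)
dense = [[rng.gauss(0.0, 0.5) for _ in range(4)] for _ in range(200)]
sparse = [[rng.uniform(-50.0, 50.0) for _ in range(4)] for _ in range(20)]
data = list(enumerate(dense + sparse))
index = LayeredIndex(data, dim=4)
```

In this sketch, sparse regions stay in a single coarse layer, while the tight cluster is split by further layers, so a query in the dense region is compared against fewer candidates than a single-layer index with the same top-level width would return.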
Pages: 69851 - 69865
Number of pages: 15
Related papers
50 in total
  • [22] Digital Watermarks for Videos Based on a Locality-Sensitive Hashing Algorithm
    Sun, Yajuan
    Srivastava, Gautam
    MOBILE NETWORKS & APPLICATIONS, 2023, 28 (05): 1724 - 1737
  • [23] Frequent-Itemset Mining Using Locality-Sensitive Hashing
    Bera, Debajyoti
    Pratap, Rameshwar
    COMPUTING AND COMBINATORICS, COCOON 2016, 2016, 9797 : 143 - 155
  • [24] Fast Access for Star Catalog Based on Locality-Sensitive Hashing
    Zhu H.
    Liang B.
    Zhang T.
    Xibei Gongye Daxue Xuebao/Journal of Northwestern Polytechnical University, 2018, 36 (05): 988 - 994
  • [25] Locality-Sensitive Hashing for Finding Nearest Neighbors in Probability Distributions
    Tang, Yi-Kun
    Mao, Xian-Ling
    Hao, Yi-Jing
    Xu, Cheng
    Huang, Heyan
    SOCIAL MEDIA PROCESSING, SMP 2017, 2017, 774 : 3 - 15
  • [26] On the Problem of p_1^{-1} in Locality-Sensitive Hashing
    Ahle, Thomas Dybdahl
    SIMILARITY SEARCH AND APPLICATIONS, SISAP 2020, 2020, 12440 : 85 - 93
  • [27] An improved method of locality-sensitive hashing for scalable instance matching
    Mehmet Aydar
    Serkan Ayvaz
    Knowledge and Information Systems, 2019, 58 : 275 - 294
  • [28] Fast hierarchical clustering algorithm using locality-sensitive hashing
    Koga, H
    Ishibashi, T
    Watanabe, T
    DISCOVERY SCIENCE, PROCEEDINGS, 2004, 3245 : 114 - 128
  • [29] A Scalable ECG Identification System Based on Locality-Sensitive Hashing
    Chu, Hui-Yu
    Lin, Tzu-Yun
    Lee, Song-Hong
    Chiu, Jui-Kun
    Nien, Cing-Ping
    Wu, Shun-Chi
    2023 45TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY, EMBC, 2023
  • [30] Similar Pair Identification using Locality-Sensitive Hashing Technique
    Lee, Kyung Mi
    Lee, Keon Myung
    6TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND INTELLIGENT SYSTEMS, AND THE 13TH INTERNATIONAL SYMPOSIUM ON ADVANCED INTELLIGENT SYSTEMS, 2012: 2117 - 2119