Fast similarity join for multi-dimensional data

Cited by: 13
Authors
Kalashnikov, Dmitri V.
Prabhakar, Sunil
Affiliations
[1] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Funding
U.S. National Science Foundation
Keywords
similarity join; grid-based joins;
DOI
10.1016/j.is.2005.07.002
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The efficient processing of multidimensional similarity joins is important for a large class of applications. The dimensionality of the data for these applications ranges from low to high. Most existing methods have focused on the execution of high-dimensional joins over large amounts of disk-based data. The increasing size of main memory available on current computers and the need for efficient processing of spatial joins suggest that spatial joins for a large class of problems can be processed in main memory. In this paper, we develop two new in-memory spatial join algorithms, the Grid-join and EGO*-join, and study their performance. Through evaluation, we explore the domain of applicability of each approach and provide recommendations for the choice of a join algorithm depending upon the dimensionality of the data as well as the expected selectivity of the join. We show that the two newly proposed join techniques substantially outperform the state-of-the-art join algorithm, the EGO-join. © 2005 Elsevier B.V. All rights reserved.
Pages: 160-177
Number of pages: 18