Fast similarity join for multi-dimensional data

Cited by: 13
Authors
Kalashnikov, Dmitri V.
Prabhakar, Sunil
Affiliations
[1] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA
[2] Purdue Univ, Dept Comp Sci, W Lafayette, IN 47907 USA
Funding
National Science Foundation (USA)
Keywords
similarity join; grid-based joins;
DOI
10.1016/j.is.2005.07.002
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The efficient processing of multidimensional similarity joins is important for a large class of applications. The dimensionality of the data for these applications ranges from low to high. Most existing methods have focused on the execution of high-dimensional joins over large amounts of disk-based data. The increasing sizes of main memory available on current computers, and the need for efficient processing of spatial joins suggest that spatial joins for a large class of problems can be processed in main memory. In this paper, we develop two new in-memory spatial join algorithms, the Grid-join and EGO*-join, and study their performance. Through evaluation, we explore the domain of applicability of each approach and provide recommendations for the choice of a join algorithm depending upon the dimensionality of the data as well as the expected selectivity of the join. We show that the two new proposed join techniques substantially outperform the state-of-the-art join algorithm, the EGO-join. (C) 2005 Elsevier B.V. All rights reserved.
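As a rough illustration of what a grid-based in-memory similarity join can look like, here is a minimal Python sketch. It is not the Grid-join or EGO*-join from the paper, whose details this record does not give. The function name `grid_similarity_join`, the Euclidean distance, and the choice of a cell side equal to the threshold `eps` are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import product
from math import dist, floor


def grid_similarity_join(a, b, eps):
    """Return sorted index pairs (i, j) with Euclidean dist(a[i], b[j]) <= eps.

    Hypothetical helper for illustration only; not the paper's algorithm.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")

    # Hash every point of b into the grid cell (side length eps) containing it.
    grid = defaultdict(list)
    for j, q in enumerate(b):
        grid[tuple(floor(x / eps) for x in q)].append(j)

    result = []
    for i, p in enumerate(a):
        cell = tuple(floor(x / eps) for x in p)
        # With cell side eps, a match can only sit in the same cell
        # or in one of the directly adjacent cells (3^d cells in total).
        for offset in product((-1, 0, 1), repeat=len(cell)):
            neighbour = tuple(c + o for c, o in zip(cell, offset))
            for j in grid.get(neighbour, ()):
                if dist(p, b[j]) <= eps:
                    result.append((i, j))
    return sorted(result)


a = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
b = [(0.1, 0.0), (1.05, 0.95), (9.0, 9.0)]
pairs = grid_similarity_join(a, b, 0.2)
```

The sketch visits 3^d neighbouring cells per point, so its cost grows quickly with the dimensionality d. That is one reason why the choice of join algorithm may depend on dimensionality and selectivity, as the abstract notes.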
Pages: 160-177
Number of pages: 18
Related papers
50 in total
  • [21] EFFICIENT SIMILARITY SEARCH FOR MULTI-DIMENSIONAL TIME SEQUENCES
    Lee, Sangjun
    Park, Jisook
    INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2010, 8 (03) : 343 - 357
  • [23] Similarity solutions for a multi-dimensional replicator dynamics equation
    Papanicolaou, Vassilis G.
    Smyrlis, George
    NONLINEAR ANALYSIS-THEORY METHODS & APPLICATIONS, 2009, 71 (7-8) : 3185 - 3196
  • [24] Similarity-Based Segmentation of Multi-Dimensional Signals
    Rainer Machné
    Douglas B. Murray
    Peter F. Stadler
    Scientific Reports, 7
  • [25] Similarity join for low- and high- dimensional data
    Kalashnikov, DV
    Prabhakar, S
    EIGHTH INTERNATIONAL CONFERENCE ON DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PROCEEDINGS, 2003 : 7 - 16
  • [26] Fast and Energy Efficient Data Storage for Information Discovery in Multi-Dimensional WSNs
    Tissera, Menik
    Doss, Robin
    Li, Gang
    Batten, Lynn
    25TH INTERNATIONAL TELECOMMUNICATION NETWORKS AND APPLICATIONS CONFERENCE (ITNAC 2015), 2015 : 88 - 93
  • [27] Multi-dimensional fast rule filter automata
    Fuchssteiner, B
    Kemper, A
    PHYSICA D, 1999, 129 (1-2) : 130 - 142
  • [28] Fast multi-dimensional NMR by minimal sampling
    Kupce, Eriks
    Freeman, Ray
    JOURNAL OF MAGNETIC RESONANCE, 2008, 191 (01) : 164 - 168
  • [29] Fast Multi-dimensional Polar Encoding and Decoding
    Mahdavifar, Hessam
    El-Khamy, Mostafa
    Lee, Jungwon
    Kang, Inyup
    2014 INFORMATION THEORY AND APPLICATIONS WORKSHOP (ITA), 2014 : 209 - 213
  • [30] Efficient Similarity Join and Search on Multi-Attribute Data
    Li, Guoliang
    He, Jian
    Deng, Dong
    Li, Jian
    SIGMOD'15: PROCEEDINGS OF THE 2015 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2015 : 1137 - 1151