GeMDA: A multidimensional data partitioning technique for multiprocessor database systems

Cited by: 7
Authors
Lo, YL
Hua, KA
Young, HC
Affiliations
[1] Chaoyang Univ Technol, Dept Informat Management, Wufeng 413, Taichung County, Taiwan
[2] Univ Cent Florida, Sch Elect Engn & Comp Sci, Orlando, FL 32816 USA
[3] IBM Corp, Almaden Res Ctr, Div Res, San Jose, CA 95120 USA
Keywords
data allocation; data fragmentation; parallel database system; query processing; system utilization
DOI
10.1023/A:1019265612794
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Several studies have repeatedly demonstrated that both the performance and scalability of a shared-nothing parallel database system depend on the physical layout of data across the processing nodes of the system. Today, data is allocated in these systems using horizontal partitioning strategies. This approach has a number of drawbacks. If a query involves the partitioning attribute, then typically only a small number of the processing nodes can be used to speed up the execution of this query. On the other hand, if the predicate of a selection query includes an attribute other than the partitioning attribute, then the entire data space must be searched. Again, this results in a waste of computing resources. In recent years, several multidimensional data declustering techniques have been proposed to address these problems. However, these schemes are either too restrictive (e.g., FX, ECC) or optimized for a certain type of query (e.g., DM, HCAM). In this paper, we introduce a new technique which is flexible and performs well for general queries. We prove its optimality properties, and present experimental results showing that our scheme outperforms DM and HCAM by a significant margin.
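The two drawbacks of single-attribute partitioning described in the abstract can be illustrated with a small sketch. This is not GeMDA, whose definition is not given in this record. It compares range partitioning on one attribute with a grid-based allocation in the style of DM (commonly expanded as Disk Modulo in the declustering literature), which sends grid cell (i, j) to node (i + j) mod M. The relation, the constants NUM_NODES, DOMAIN, and CELLS_PER_DIM, and all function names are hypothetical. The sketch assumes one tuple per (A, B) pair and that every bucket overlapping the query is read in full.

```python
from collections import Counter
from itertools import product

NUM_NODES = 4       # hypothetical number of processing nodes
DOMAIN = 16         # hypothetical domain: attributes A and B take values 0..15
CELLS_PER_DIM = 4   # hypothetical grid resolution for the multidimensional scheme


def range_bucket(a, b):
    """Range partitioning on attribute A only: one bucket per node."""
    node = a * NUM_NODES // DOMAIN
    return node, node  # (node, bucket id)


def grid_bucket(a, b):
    """Grid declustering in the style of Disk Modulo: cell (i, j) -> (i + j) mod M."""
    width = DOMAIN // CELLS_PER_DIM
    i, j = a // width, b // width
    return (i + j) % NUM_NODES, (i, j)  # (node, bucket id)


def scan_load(bucket_of, a_range, b_range):
    """Tuples each node must scan for a range query on A and B.

    Assumption: every bucket that overlaps the query rectangle is read in full.
    """
    hit = {bucket_of(a, b) for a, b in product(range(*a_range), range(*b_range))}
    load = Counter()
    for a, b in product(range(DOMAIN), repeat=2):
        key = bucket_of(a, b)
        if key in hit:
            load[key[0]] += 1
    return dict(load)


FULL = (0, DOMAIN)

# Predicate on the partitioning attribute A: one node does all the work.
on_a_range = scan_load(range_bucket, (0, 4), FULL)
# Predicate on the other attribute B: every node scans its whole fragment.
on_b_range = scan_load(range_bucket, FULL, (0, 4))
# The grid scheme spreads both queries over all nodes and scans fewer tuples.
on_a_grid = scan_load(grid_bucket, (0, 4), FULL)
on_b_grid = scan_load(grid_bucket, FULL, (0, 4))
```

Under these assumptions, the range scheme answers the query on A with node 0 alone (64 tuples scanned) and answers the query on B by scanning all 256 tuples across the four nodes. The grid scheme scans 16 tuples on each of the four nodes in both cases.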
Pages: 211-236
Page count: 26