R-Grove: Growing a Family of R-trees in the Big-Data Forest

Cited by: 12
Authors
Vu, Tin [1]
Eldawy, Ahmed [1]
Affiliations
[1] Univ Calif Riverside, Comp Sci & Engn, Riverside, CA 92521 USA
Funding
National Science Foundation (USA)
Keywords
Spatial Partitioning; Big Data; Indexing
DOI
10.1145/3274895.3274984
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The rapid growth of big spatial data has urged the research community to develop several big spatial data systems. Regardless of their architecture, one of the fundamental requirements of all these systems is to partition the data efficiently across machines. A widely used technique for big spatial indexing is to reuse existing search trees as-is, e.g., the R-tree family, by building a temporary tree for a sample of the input and using its leaf nodes as partition boundaries. However, we show in this paper that this approach has major limitations that make it unsuitable for the big data environment. This paper studies the use of three popular trees from the R-tree family to index big spatial data, namely, the original R-tree by Guttman, the R*-tree, and the RR*-tree. We show that the entire family of R-trees is not ready to grow in the big data forest due to fundamental limitations in their design. To overcome these limitations, we propose three new indexes, namely, R-Grove, R*-Grove, and RR*-Grove, which are fundamentally modified to work with big data while inheriting the main characteristics of their traditional index counterparts. With all the proposed indexes publicly available as open source, we hope that these new indexes will be adopted by the community to better serve big spatial data research.
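The sample-based technique the abstract describes (build a temporary tree on a sample, then use its leaf nodes as partition boundaries) can be illustrated with a minimal Python sketch. This is not the paper's code and not any of the Grove indexes. As a stand-in for a real R-tree, it assumes Sort-Tile-Recursive (STR) packing to form the leaves, and it assumes that a point falling outside every leaf box goes to the box needing the least area enlargement. All function names and parameters (`str_leaf_mbrs`, `assign`, `partition`, `sample_ratio`, `capacity`) are hypothetical.

```python
import math
import random


def str_leaf_mbrs(sample, capacity):
    """Pack 2-D sample points into leaves with STR (an assumed stand-in
    for an R-tree) and return one box (xmin, ymin, xmax, ymax) per leaf."""
    n_leaves = math.ceil(len(sample) / capacity)
    n_slices = math.ceil(math.sqrt(n_leaves))
    slice_size = n_slices * capacity  # points per vertical slice
    by_x = sorted(sample)  # sort by x (ties broken by y)
    mbrs = []
    for i in range(0, len(by_x), slice_size):
        # Within each vertical slice, sort by y and cut into leaves.
        strip = sorted(by_x[i:i + slice_size], key=lambda p: p[1])
        for j in range(0, len(strip), capacity):
            leaf = strip[j:j + capacity]
            xs = [p[0] for p in leaf]
            ys = [p[1] for p in leaf]
            mbrs.append((min(xs), min(ys), max(xs), max(ys)))
    return mbrs


def assign(point, mbrs):
    """Return the index of the first box containing the point, or, if none
    contains it, the box whose area grows least when extended to cover it."""
    x, y = point
    best, best_cost = None, None
    for idx, (x1, y1, x2, y2) in enumerate(mbrs):
        if x1 <= x <= x2 and y1 <= y <= y2:
            return idx
        nx1, ny1 = min(x1, x), min(y1, y)
        nx2, ny2 = max(x2, x), max(y2, y)
        cost = (nx2 - nx1) * (ny2 - ny1) - (x2 - x1) * (y2 - y1)
        if best_cost is None or cost < best_cost:
            best, best_cost = idx, cost
    return best


def partition(points, sample_ratio=0.01, capacity=50, seed=0):
    """Sample the input, derive leaf boxes from the sample, then route
    every input point to exactly one partition."""
    rng = random.Random(seed)
    k = max(1, int(len(points) * sample_ratio))
    sample = rng.sample(points, k)
    mbrs = str_leaf_mbrs(sample, capacity)
    parts = [[] for _ in mbrs]
    for p in points:
        parts[assign(p, mbrs)].append(p)
    return mbrs, parts
```

A call such as `mbrs, parts = partition(points, sample_ratio=0.1)` on a list of `(x, y)` tuples returns the leaf boxes and the points routed to each. Nothing in this sketch bounds how many records end up in a partition, since the boxes are derived from the sample alone.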
Pages: 532-535
Page count: 4