A Comparison of Distributed Spatial Data Management Systems for Processing Distance Join Queries

Cited by: 6
Authors
Garcia-Garcia, Francisco [1]
Corral, Antonio [1]
Iribarne, Luis [1]
Mavrommatis, George [2]
Vassilakopoulos, Michael [2]
Affiliations
[1] Univ Almeria, Dept Informat, Almeria, Spain
[2] Univ Thessaly, DaSE Lab, Dept Elect & Comp Engn, Volos, Greece
Keywords
Spatial data processing; Distance joins; SpatialHadoop; LocationSpark; Algorithms
DOI
10.1007/978-3-319-66917-5_15
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Due to the ubiquitous use of spatial data applications and the large amounts of spatial data that these applications generate, the processing of large-scale distance joins in distributed systems is becoming increasingly popular. Two of the most studied distance join queries are the K Closest Pair Query (KCPQ) and the ε Distance Join Query (εDJQ). The KCPQ finds the K closest pairs of points from two datasets, and the εDJQ finds all the possible pairs of points from two datasets that are within a distance threshold ε of each other. Distributed cluster-based computing systems can be classified into Hadoop-based and Spark-based systems. Based on this classification, in this paper we compare two of the most current and leading distributed spatial data management systems, namely SpatialHadoop and LocationSpark, by evaluating the performance of existing and newly proposed parallel and distributed distance join query algorithms in different situations with big real-world datasets. As a general conclusion, while SpatialHadoop is a more mature and robust system, LocationSpark is the winner with respect to the total execution time.
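The two query definitions in the abstract can be illustrated with a minimal brute-force sketch. This is not the paper's distributed algorithm and does not use SpatialHadoop or LocationSpark; it only pins down what each query returns. The function names `k_closest_pairs` and `distance_join`, the sample points, Euclidean distance, and the inclusive threshold (`<=`) are all assumptions made for illustration.

```python
import heapq
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def k_closest_pairs(P, Q, k):
    """KCPQ by brute force: the k pairs (p, q) with smallest distance.

    Returns (distance, p, q) tuples, nearest first. Hypothetical helper;
    it scans all |P| * |Q| pairs, so it is only a definition-level sketch.
    """
    candidates = ((dist(p, q), p, q) for p, q in product(P, Q))
    return heapq.nsmallest(k, candidates)


def distance_join(P, Q, threshold):
    """Distance join by brute force: every pair (p, q) within the threshold.

    Assumes the threshold is inclusive (distance <= threshold).
    """
    return [(p, q) for p, q in product(P, Q) if dist(p, q) <= threshold]


# Made-up sample data, not taken from the paper's datasets.
P = [(0.0, 0.0), (5.0, 5.0)]
Q = [(1.0, 0.0), (5.0, 6.0), (10.0, 10.0)]

closest = k_closest_pairs(P, Q, 2)
within = distance_join(P, Q, 1.0)
print(closest)
print(within)
```

The quadratic scan is what distributed systems of the kind compared in the paper aim to avoid; the sketch is useful only as a correctness reference on small inputs.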
Pages: 214-228
Page count: 15
Related papers
50 in total
  • [1] Efficient distance join query processing in distributed spatial data management systems
    Garcia-Garcia, Francisco
    Corral, Antonio
    Iribarne, Luis
    Vassilakopoulos, Michael
    Manolopoulos, Yannis
    [J]. INFORMATION SCIENCES, 2020, 512 : 985 - 1008
  • [2] Distributed approach of continuous queries with KNN join processing in spatial data warehouse
    Gorawski, Marcin
    Gebczyk, Wojciech
    [J]. ICEIS 2007: PROCEEDINGS OF THE NINTH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS: DATABASES AND INFORMATION SYSTEMS INTEGRATION, 2007, : 131 - 136
  • [3] Efficient distributed algorithms for distance join queries in spark-based spatial analytics systems
    Garcia-Garcia, Francisco
    Corral, Antonio
    Iribarne, Luis
    Vassilakopoulos, Michael
    [J]. INTERNATIONAL JOURNAL OF GENERAL SYSTEMS, 2023, 52 (03) : 206 - 250
  • [4] Efficient Parallel Processing of Distance Join Queries Over Distributed Graphs
    Zhang, Xiaofei
    Chen, Lei
    Wang, Min
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (03) : 740 - 754
  • [5] Processing distance join queries with constraints
    Papadopoulos, Apostolos N.
    Nanopoulos, Alexandros
    Manolopoulos, Yannis
    [J]. Computer Journal, 2006, 49 (03) : 281 - 296
  • [6] Processing distance join queries with constraints
    Papadopoulos, AN
    Nanopoulos, A
    Manolopoulos, Y
    [J]. COMPUTER JOURNAL, 2006, 49 (03) : 281 - 296
  • [7] Adaptive and incremental processing for distance join queries
    Shin, H
    Moon, B
    Lee, S
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2003, 15 (06) : 1561 - 1578
  • [8] Distance join queries of multiple inputs in spatial databases
    Corral, A
    Manolopoulos, Y
    Theodoridis, Y
    Vassilakopoulos, M
    [J]. ADVANCES IN DATABASES AND INFORMATION SYSTEMS, PROCEEDINGS, 2003, 2798 : 323 - 338
  • [9] A Taxonomy for Distance-Based Spatial Join Queries
    Li, Lingxiao
    Taniar, David
    [J]. INTERNATIONAL JOURNAL OF DATA WAREHOUSING AND MINING, 2017, 13 (03) : 1 - 24
  • [10] Realization of continuous queries with NN join processing in spatial telemetric data warehouse
    Gorawski, Marcin
    Gebczyk, Wojciech
    [J]. SEVENTEENTH INTERNATIONAL CONFERENCE ON DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2006, : 632 - +