Performance Evaluation of Spatial Data Management Systems Using GeoSpark

Cited by: 3
Authors
Shin, Hansub [1]
Lee, Kisung [2]
Kwon, Hyuk-Yoon [1]
Affiliations
[1] Seoul Natl Univ Sci & Technol, Dept Ind & Syst Engn, Seoul, South Korea
[2] Louisiana State Univ, Div Comp Sci & Engn, Baton Rouge, LA 70803 USA
Funding
National Research Foundation of Singapore
Keywords
Large-scale spatial data; GeoSpark; Performance evaluation; Distributed environments; Big data
DOI
10.1109/BigComp48618.2020.00-75
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In this paper, we evaluate the performance of spatial data management systems in distributed computing environments. Because several studies report that GeoSpark outperforms other spatial systems in many scenarios, we focus on spatial data management systems built on GeoSpark. Although GeoSpark supports various storage engines as its underlying data store, the effect of the storage engine on spatial data processing has not been well studied. To address this limitation, we evaluate the performance of GeoSpark using two underlying data stores: 1) HDFS and 2) MongoDB. We first design and build distributed experimental environments based on Amazon EC2 and EMR using up to 10 nodes. Through extensive experiments on three synthetic and real data sets, we show that the overall performance of both HDFS- and MongoDB-based GeoSpark improves as the number of nodes increases. We also show that HDFS-based GeoSpark generally outperforms MongoDB-based GeoSpark, especially for large-scale data sets. In addition, we demonstrate that the proper use of caching on HDFS-based GeoSpark can improve the overall query processing performance by up to three orders of magnitude.
Pages: 197-200
Number of pages: 4