Building a Large-Scale Microscopic Road Network Traffic Simulator in Apache Spark

Cited by: 3
Authors
Fu, Zishan [1]
Yu, Jia [1]
Sarwat, Mohamed [1]
Affiliation
[1] Arizona State Univ, Dept Comp Sci, Tempe, AZ 85287 USA
Source
2019 20TH INTERNATIONAL CONFERENCE ON MOBILE DATA MANAGEMENT (MDM 2019) | 2019
Funding
National Science Foundation (USA);
Keywords
Spatio-temporal Data; Apache Spark; Traffic Model; Microscopic Traffic Simulation; MODEL;
DOI
10.1109/MDM.2019.00-42
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Road network traffic data has been widely studied by researchers and practitioners in areas such as urban planning, traffic prediction, and spatial-temporal databases. For instance, researchers use such data to evaluate the impact of road network changes. Unfortunately, collecting large-scale, high-quality urban traffic data requires tremendous effort because participating vehicles must install GPS receivers and administrators must continuously monitor these devices. A number of urban traffic simulators have tried to generate such data with different features. However, they suffer from two critical issues: (1) scalability: most of them offer only a single-machine solution, which is not adequate for producing large-scale data, and some simulators can generate traffic in parallel but do not balance the load well among machines in a cluster; (2) granularity: many simulators do not consider microscopic traffic situations, including traffic lights, lane changing, and car following. In this paper, we propose GeoSparkSim, a scalable traffic simulator that extends Apache Spark to generate large-scale road network traffic datasets with microscopic traffic simulation. The proposed system seamlessly integrates with a Spark-based spatial data management system, GeoSpark, to deliver a holistic approach that allows data scientists to simulate, analyze, and visualize large-scale urban traffic data. To implement microscopic traffic models, GeoSparkSim employs a simulation-aware vehicle partitioning method to partition vehicles among different machines such that each machine has a balanced workload. The experimental analysis shows that GeoSparkSim can simulate the movements of 200 thousand vehicles over a very large road network (250 thousand road junctions and 300 thousand road segments).
Pages: 320 - 328
Page count: 9
Related papers
  • [1] Dissecting GeoSparkSim: a scalable microscopic road network traffic simulator in Apache Spark
    Jia Yu
    Zishan Fu
    Mohamed Sarwat
    Distributed and Parallel Databases, 2020, 38 : 963 - 994
  • [2] Dissecting GeoSparkSim: a scalable microscopic road network traffic simulator in Apache Spark
    Yu, Jia
    Fu, Zishan
    Sarwat, Mohamed
    DISTRIBUTED AND PARALLEL DATABASES, 2020, 38 (04) : 963 - 994
  • [3] Large-Scale Network Embedding in Apache Spark
    Lin, Wenqing
    KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, : 3271 - 3279
  • [4] Demonstrating GeoSparkSim: A Scalable Microscopic Road Network Traffic Simulator Based on Apache Spark
    Fu, Zishan
    Yu, Jia
    Sarwat, Mohamed
    SSTD '19 - PROCEEDINGS OF THE 16TH INTERNATIONAL SYMPOSIUM ON SPATIAL AND TEMPORAL DATABASES, 2019, : 186 - 189
  • [5] Large-Scale Data Pollution with Apache Spark
    Hildebrandt, Kai
    Panse, Fabian
    Wilcke, Niklas
    Ritter, Norbert
    IEEE TRANSACTIONS ON BIG DATA, 2020, 6 (02) : 396 - 411
  • [6] Processing large-scale data with Apache Spark
    Ko, Seyoon
    Won, Joong-Ho
    KOREAN JOURNAL OF APPLIED STATISTICS, 2016, 29 (06) : 1077 - 1094
  • [7] Large-scale text processing pipeline with Apache Spark
    Svyatkovskiy, A.
    Imai, K.
    Kroeger, M.
    Shiraito, Y.
    2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2016, : 3928 - 3935
  • [8] GeoMatch: Efficient Large-Scale Map Matching on Apache Spark
    Zeidan, Ayman
    Lagerspetz, Eemil
    Zhao, Kai
    Nurmi, Petteri
    Tarkoma, Sasu
    Vo, Huy T.
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 384 - 391
  • [9] Filter Large-scale Engine Data using Apache Spark
    Pirozzi, Donato
    Scarano, Vittorio
    Begg, Steven
    De Sercey, Guillaume
    Fish, Andrew
    Harvey, Andrew
    2016 IEEE 14TH INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN), 2016, : 1300 - 1305
  • [10] Particle Swarm Optimization for Large-Scale Clustering on Apache Spark
    Sherar, Matthew
    Zulkernine, Farhana
    2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI), 2017, : 801 - 808