A MapReduce-Based Nearest Neighbor Approach for Big-Data-Driven Traffic Flow Prediction

Cited by: 35
Authors
Xia, Dawen [1, 2]
Li, Huaqing [3]
Wang, Binfeng [1]
Li, Yantao [1]
Zhang, Zili [1, 4]
Affiliations
[1] Southwest Univ, Sch Comp & Informat Sci, Chongqing 400715, Peoples R China
[2] Guizhou Minzu Univ, Sch Informat Engn, Guiyang 550025, Peoples R China
[3] Southwest Univ, Sch Elect & Informat Engn, Chongqing 400715, Peoples R China
[4] Deakin Univ, Sch Informat Technol, Geelong, Vic 3220, Australia
Source
IEEE ACCESS | 2016 / Vol. 4
Funding
National Natural Science Foundation of China;
Keywords
Big data analytics; traffic flow prediction; correlation analysis; parallel classifier; Hadoop MapReduce; TRAVEL-TIME PREDICTION; TRANSPORTATION; NETWORK; FREEWAY; SYSTEMS;
DOI
10.1109/ACCESS.2016.2570021
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
In big-data-driven traffic flow prediction systems, the robustness of prediction performance depends on accuracy and timeliness. This paper presents a new MapReduce-based nearest neighbor (NN) approach for traffic flow prediction using correlation analysis (TFPC) on a Hadoop platform. In particular, we develop a real-time prediction system comprising two key modules: offline distributed training (ODT) and online parallel prediction (OPP). Moreover, we build a parallel k-nearest neighbor optimization classifier, which incorporates correlation information among traffic flows into the classification process. Finally, we propose a novel prediction calculation method that combines the current data observed in OPP with the classification results obtained from large-scale historical data in ODT to generate traffic flow predictions in real time. An empirical study on real-world traffic flow big data, using leave-one-out cross validation, shows that TFPC significantly outperforms four state-of-the-art prediction approaches (autoregressive integrated moving average, Naive Bayes, multilayer perceptron neural networks, and NN regression) in terms of accuracy, which is improved by up to 90.07% in the best case, with an average mean absolute percentage error of 5.53%. In addition, it displays excellent speedup, scaleup, and sizeup.
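As a rough illustration of the ideas named in the abstract, the following minimal sketch shows a correlation-weighted k-nearest-neighbor prediction and the mean absolute percentage error used as the accuracy measure. It is not the paper's TFPC implementation: the function names (`pearson`, `knn_predict`, `mape`), the sliding-window data layout, and the weighting rule are all assumptions made for this example, and the "map" and "reduce" steps are plain single-process Python rather than Hadoop jobs.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def knn_predict(history, current, k=2):
    """Predict the next flow value from the k most correlated historical windows.

    history: list of (window, next_value) pairs (hypothetical layout).
    current: the window of flow values observed online.
    """
    # "Map" step: score every historical window against the current one.
    scored = [(pearson(window, current), nxt) for window, nxt in history]
    # "Reduce" step: keep the k best-correlated neighbors.
    top = sorted(scored, key=lambda t: t[0], reverse=True)[:k]
    # Negative correlations get zero weight (an assumption of this sketch).
    total = sum(max(w, 0.0) for w, _ in top)
    if total == 0:
        return sum(nxt for _, nxt in top) / len(top)
    return sum(max(w, 0.0) * nxt for w, nxt in top) / total


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    history = [([1, 2, 3], 4), ([3, 2, 1], 0), ([2, 4, 6], 8)]
    print(knn_predict(history, [10, 20, 30], k=2))
    print(mape([100, 200], [90, 220]))
```

In a distributed setting, the scoring step would typically run in mappers over partitions of the historical data and the top-k selection in reducers, but that split is only suggested here, not reproduced.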
Pages: 2920-2934
Number of pages: 15
Related papers
50 in total
  • [31] A Radar-Nearest-Neighbor based data-driven approach for crowd simulation
    Zhao, Xuedan
    Zhang, Jun
    Song, Weiguo
    TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2021, 129
  • [32] Big-data-driven Anomaly Detection in Industry (4.0): an approach and a case study
    Stojanovic, Ljiljana
    Dinic, Marko
    Stojanovic, Nenad
    Stojadinovic, Aleksandar
    2016 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2016, : 1647 - 1652
  • [33] Traffic Incident Duration Prediction Based On K-Nearest Neighbor
    Wen, Yuan
    Chen, Shuyan
    Xiong, Qinyuan
    Han, Rubi
    Chen, Shiyu
    SUSTAINABLE DEVELOPMENT OF URBAN INFRASTRUCTURE, PTS 1-3, 2013, 253-255 : 1675 - 1681
  • [34] Scaling up MapReduce-based Big Data Processing on Multi-GPU systems
    Jiang, Hai
    Chen, Yi
    Qiao, Zhi
    Weng, Tien-Hsiung
    Li, Kuan-Ching
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2015, 18 (01): 369 - 383
  • [35] Distributed Big Data Clustering using MapReduce-based Fuzzy C-Medoids
    Sardar T.H.
    Ansari Z.
    Journal of The Institution of Engineers (India): Series B, 2022, 103 (01) : 73 - 82
  • [36] MapReduce-Based Complex Big Data Analytics over Uncertain and Imprecise Social Networks
    Braun, Peter
    Cuzzocrea, Alfredo
    Jiang, Fan
    Leung, Carson Kai-Sang
    Pazdor, Adam G. M.
    BIG DATA ANALYTICS AND KNOWLEDGE DISCOVERY, DAWAK 2017, 2017, 10440 : 130 - 145
  • [37] Spatial-Temporal K Nearest Neighbors Model on MapReduce for Traffic Flow Prediction
    Agafonov, Anton
    Yumaganov, Alexander
    INTELLIGENT DATA ENGINEERING AND AUTOMATED LEARNING - IDEAL 2018, PT I, 2018, 11314 : 253 - 260
  • [38] MapReduce-Based D_ELT Framework to Address the Challenges of Geospatial Big Data
    Jo, Junghee
    Lee, Kang-Woo
    ISPRS INTERNATIONAL JOURNAL OF GEO-INFORMATION, 2019, 8 (11)
  • [39] Big Data-Driven Based Real-Time Traffic Flow State Identification and Prediction
    Lu, Hua-pu
    Sun, Zhi-yuan
    Qu, Wen-cong
    DISCRETE DYNAMICS IN NATURE AND SOCIETY, 2015, 2015
  • [40] MapReduce-based parallel GEP algorithm for efficient function mining in big data applications
    Liu, Yang
    Ma, Chenxiao
    Xu, Lixiong
    Shen, Xiaodong
    Li, Maozhen
    Li, Pengcheng
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2018, 30 (23):