A MapReduce-Based Nearest Neighbor Approach for Big-Data-Driven Traffic Flow Prediction

Cited by: 35
Authors
Xia, Dawen [1, 2]
Li, Huaqing [3]
Wang, Binfeng [1]
Li, Yantao [1]
Zhang, Zili [1, 4]
Affiliations
[1] Southwest Univ, Sch Comp & Informat Sci, Chongqing 400715, Peoples R China
[2] Guizhou Minzu Univ, Sch Informat Engn, Guiyang 550025, Peoples R China
[3] Southwest Univ, Sch Elect & Informat Engn, Chongqing 400715, Peoples R China
[4] Deakin Univ, Sch Informat Technol, Geelong, Vic 3220, Australia
Source
IEEE ACCESS | 2016 / Vol. 4
Funding
National Natural Science Foundation of China
Keywords
Big data analytics; traffic flow prediction; correlation analysis; parallel classifier; Hadoop MapReduce; TRAVEL-TIME PREDICTION; TRANSPORTATION; NETWORK; FREEWAY; SYSTEMS;
DOI
10.1109/ACCESS.2016.2570021
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In big-data-driven traffic flow prediction systems, the robustness of prediction performance depends on accuracy and timeliness. This paper presents a new MapReduce-based nearest neighbor (NN) approach for traffic flow prediction using correlation analysis (TFPC) on a Hadoop platform. In particular, we develop a real-time prediction system including two key modules, i.e., offline distributed training (ODT) and online parallel prediction (OPP). Moreover, we build a parallel k-nearest neighbor optimization classifier, which incorporates correlation information among traffic flows into the classification process. Finally, we propose a novel prediction calculation method, combining the current data observed in OPP and the classification results obtained from large-scale historical data in ODT, to generate traffic flow prediction in real time. The empirical study on real-world traffic flow big data using the leave-one-out cross validation method shows that TFPC significantly outperforms four state-of-the-art prediction approaches, i.e., autoregressive integrated moving average, Naive Bayes, multilayer perceptron neural networks, and NN regression, in terms of accuracy, which can be improved 90.07% in the best case, with an average mean absolute percent error of 5.53%. In addition, it displays excellent speedup, scaleup, and sizeup.
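As a rough illustration of the ideas named in the abstract, the following is a minimal single-process sketch of a map/reduce-style k-nearest-neighbor prediction together with the mean absolute percent error metric. It is not the paper's TFPC method: it omits the correlation analysis and does not run on Hadoop. The function names (`map_partition`, `reduce_candidates`, `predict`, `mape`), the data layout (pairs of a history window and the next observed flow value), and the use of a plain average over the k neighbors are all assumptions made for illustration.

```python
from functools import reduce
from heapq import nsmallest
from math import dist  # Euclidean distance, Python 3.8+


def map_partition(partition, query, k):
    """Map step (hypothetical): the k nearest (distance, next_value) pairs in one partition."""
    scored = [(dist(window, query), next_value) for window, next_value in partition]
    return nsmallest(k, scored)


def reduce_candidates(a, b, k):
    """Reduce step (hypothetical): merge two candidate lists and keep the k nearest."""
    return nsmallest(k, a + b)


def predict(partitions, query, k):
    """Predict the next flow value as the plain mean over the k global nearest neighbors."""
    local = [map_partition(p, query, k) for p in partitions]
    best = reduce(lambda a, b: reduce_candidates(a, b, k), local)
    return sum(value for _, value in best) / len(best)


def mape(actual, predicted):
    """Mean absolute percent error, in percent; assumes no actual value is zero."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Toy historical data split into two partitions: (window of past flows, next flow).
partitions = [
    [((10, 12), 14), ((50, 52), 54)],
    [((11, 13), 15), ((30, 32), 34)],
]
print(predict(partitions, (10, 12), k=2))
print(mape([100, 200], [90, 220]))
```

Each partition only ever returns k candidates, so the merge step stays small regardless of how much historical data a partition holds; that is the usual reason nearest-neighbor search fits the map/reduce pattern.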
Pages: 2920-2934
Number of pages: 15