A MapReduce-Based Nearest Neighbor Approach for Big-Data-Driven Traffic Flow Prediction

Cited by: 35
Authors
Xia, Dawen [1 ,2 ]
Li, Huaqing [3 ]
Wang, Binfeng [1 ]
Li, Yantao [1 ]
Zhang, Zili [1 ,4 ]
Affiliations
[1] Southwest Univ, Sch Comp & Informat Sci, Chongqing 400715, Peoples R China
[2] Guizhou Minzu Univ, Sch Informat Engn, Guiyang 550025, Peoples R China
[3] Southwest Univ, Sch Elect & Informat Engn, Chongqing 400715, Peoples R China
[4] Deakin Univ, Sch Informat Technol, Geelong, Vic 3220, Australia
Source
IEEE ACCESS | 2016 / Vol. 4
Funding
National Natural Science Foundation of China;
Keywords
Big data analytics; traffic flow prediction; correlation analysis; parallel classifier; Hadoop MapReduce; TRAVEL-TIME PREDICTION; TRANSPORTATION; NETWORK; FREEWAY; SYSTEMS;
DOI
10.1109/ACCESS.2016.2570021
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In big-data-driven traffic flow prediction systems, the robustness of prediction performance depends on accuracy and timeliness. This paper presents a new MapReduce-based nearest neighbor (NN) approach for traffic flow prediction using correlation analysis (TFPC) on a Hadoop platform. In particular, we develop a real-time prediction system including two key modules, i.e., offline distributed training (ODT) and online parallel prediction (OPP). Moreover, we build a parallel k-nearest neighbor optimization classifier, which incorporates correlation information among traffic flows into the classification process. Finally, we propose a novel prediction calculation method, combining the current data observed in OPP and the classification results obtained from large-scale historical data in ODT, to generate traffic flow predictions in real time. An empirical study on real-world traffic flow big data using the leave-one-out cross-validation method shows that TFPC significantly outperforms four state-of-the-art prediction approaches, i.e., autoregressive integrated moving average, Naive Bayes, multilayer perceptron neural networks, and NN regression, in terms of accuracy, which improves by up to 90.07% in the best case, with an average mean absolute percentage error of 5.53%. In addition, it displays excellent speedup, scaleup, and sizeup.
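The abstract describes a nearest neighbor search spread across MapReduce tasks and evaluated by mean absolute percentage error. The sketch below illustrates only that general pattern in plain Python, with no Hadoop involved. It is not the authors' TFPC method: the correlation weighting, the ODT/OPP split, and the prediction calculation are not reproduced, and the function names, the Euclidean distance, the unweighted mean, and the toy records are all assumptions made for illustration.

```python
from functools import reduce
import heapq
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def map_partition(partition, query, k):
    """Map step (hypothetical): k nearest (distance, next_flow) pairs in one partition."""
    scored = [(math.dist(features, query), next_flow)
              for features, next_flow in partition]
    return heapq.nsmallest(k, scored)


def reduce_neighbors(left, right, k):
    """Reduce step (hypothetical): merge two candidate lists, keep the k nearest."""
    return heapq.nsmallest(k, left + right)


def knn_predict(partitions, query, k):
    """Predict the next flow as the unweighted mean over the k global nearest neighbors."""
    candidates = [map_partition(p, query, k) for p in partitions]
    nearest = reduce(lambda a, b: reduce_neighbors(a, b, k), candidates)
    return sum(flow for _, flow in nearest) / len(nearest)


# Toy historical records: ((flow at t-1, flow at t), flow at t+1), split in two partitions.
partitions = [
    [((100, 110), 120), ((200, 210), 220)],
    [((105, 115), 125), ((300, 310), 320)],
]
prediction = knn_predict(partitions, query=(102, 112), k=2)
```

Each partition returns only its own k best candidates, so the reduce step merges short lists rather than the full history; that is the property that makes this kind of search easy to distribute.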
Pages: 2920-2934
Page count: 15
Related papers
50 in total
  • [21] Technological Surveillance in Big Data Environments by using a MapReduce-based Method
    Pascal Filho, Daniel San Martin
    Jeronimo de Macedo, Douglas Dyllon
    Dutra, Moises Lima
    MOBILE NETWORKS & APPLICATIONS, 2022, 27 (05) : 1931 - 1940
  • [22] Uncertainty prediction method for traffic flow based on K-nearest neighbor algorithm
    Yang, Lingmin
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 39 (02) : 1489 - 1499
  • [23] A Tensor-Based Big-Data-Driven Routing Recommendation Approach for Heterogeneous Networks
    Wang, Xiaokang
    Yang, Laurence T.
    Kuang, Liwei
    Liu, Xingang
    Zhang, Qingxia
    Deen, M. Jamal
    IEEE NETWORK, 2019, 33 (01) : 64 - 69
  • [24] Traffic Flow Prediction Based on the location of Big Data
    Zhang, Xijun
    Yuan, Zhanting
    PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON CIVIL ENGINEERING AND TRANSPORTATION 2015, 2016, 30 : 1221 - 1225
  • [25] Traffic Flow Prediction With Big Data: A Deep Learning Approach
    Lv, Yisheng
    Duan, Yanjie
    Kang, Wenwen
    Li, Zhengxi
    Wang, Fei-Yue
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2015, 16 (02) : 865 - 873
  • [26] Gold Price Forecasting: A Novel Approach Based on Text Mining and Big-Data-Driven Model
    Kian Poor, Saeed
    Fattahi, Shahram
    Hajian, Mohsen
    STUDIES IN BUSINESS AND ECONOMICS, 2024, 19 (03) : 156 - 171
  • [27] Research on MapReduce-based fuzzy associative classifier for big probabilistic numerical data
    Pei, Bin
    Wang, Fenmei
    Wang, Xiuzhen
    2016 IEEE INTERNATIONAL CONFERENCE ON INTERNET OF THINGS (ITHINGS) AND IEEE GREEN COMPUTING AND COMMUNICATIONS (GREENCOM) AND IEEE CYBER, PHYSICAL AND SOCIAL COMPUTING (CPSCOM) AND IEEE SMART DATA (SMARTDATA), 2016 : 903 - 906
  • [28] Knowledge process of health big data using MapReduce-based associative mining
    Choi, So-Young
    Chung, Kyungyong
    PERSONAL AND UBIQUITOUS COMPUTING, 2020, 24 (05) : 571 - 581
  • [29] Knowledge process of health big data using MapReduce-based associative mining
    So-Young Choi
    Kyungyong Chung
    Personal and Ubiquitous Computing, 2020, 24 : 571 - 581
  • [30] Research on Big Data-Driven Urban Traffic Flow Prediction Based on Deep Learning
    Qin, Xiaoan
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGIES AND SYSTEMS APPROACH, 2023, 16 (01)