A spatiotemporal approach for traffic data imputation with complicated missing patterns

Cited by: 37
Authors
Li, Huiping [1]
Li, Meng [1, 2]
Lin, Xi [1, 2]
He, Fang [2, 3]
Wang, Yinhai [1, 4]
Affiliations
[1] Tsinghua Univ, Dept Civil Engn, Beijing 100084, Peoples R China
[2] Tsinghua Univ, Tsinghua Daimler Joint Res Ctr Sustainable Transp, Beijing 100084, Peoples R China
[3] Tsinghua Univ, Dept Ind Engn, Beijing 100084, Peoples R China
[4] Univ Washington, Dept Civil & Environm Engn, Seattle, WA 98195 USA
Funding
National Natural Science Foundation of China
Keywords
Completely missing; Iterative random forest; Prophet; Time series; Residual; Fuzzy c-means; Tensor; Values; Flow
DOI
10.1016/j.trc.2020.102730
Chinese Library Classification
U [Transportation]
Discipline classification code
08; 0823
Abstract
With the advent of intelligent transportation systems (ITS), spatiotemporal traffic data have gained growing importance in real-time traffic monitoring, prediction, and control. In practice, however, data collection devices often malfunction because of unpredictable disruptions, resulting in the so-called "missing value problem." In realistic cases, disruptions to data collection devices are often associated with key events (e.g., power cuts and natural disasters). Combined with other disruptions, these events can produce complicated missing patterns in which values are missing both randomly and completely. To perform imputation under such complicated missing patterns, we propose a hybrid spatiotemporal method that exploits time series properties with the "prophet" model and captures spatial residual information with an iterative random forest model. The method first applies the temporal part to fill the missing values and then uses the spatial part to estimate the residual component of the missing values. The results of the two components are combined into the final imputations. Using the PeMS freeway dataset (PeMS, 2019) and an urban road dataset under extensive artificially designed scenarios, including randomly, clustered non-completely, and completely missing patterns, we compare the proposed approach with existing techniques such as K-Nearest Neighbor (KNN), Seasonal-Trend decomposition using Loess (STL), Bayesian tensor decomposition, and Denoising AutoEncoder (DAE). The test results indicate that the hybrid method achieves the best imputation quality for most missing patterns, particularly completely missing or hybrid missing patterns. Furthermore, the hybrid model still performs well under extreme missing rates as high as 0.9, which validates the robustness of the model in extreme situations.
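The two-stage structure described in the abstract (a temporal fill followed by a spatial residual correction, with the two parts summed) can be illustrated with a minimal sketch. This is not the authors' implementation: the per-slot mean profile is a crude stand-in for the "prophet" model, and averaging neighbouring sensors' residuals is a crude stand-in for the iterative random forest. The function names `temporal_fit` and `hybrid_impute`, the `period` parameter, and the toy data are all assumptions made for illustration.

```python
from statistics import mean


def temporal_fit(series, period):
    """Temporal component: per-slot mean of the observed values.

    Stand-in for the "prophet" model (assumption, not the paper's model).
    `series` is a list of readings where None marks a missing value.
    """
    observed = [v for v in series if v is not None]
    overall = mean(observed) if observed else 0.0
    profile = []
    for slot in range(period):
        vals = [v for t, v in enumerate(series)
                if v is not None and t % period == slot]
        # Fall back to the overall mean when a slot has no observations.
        profile.append(mean(vals) if vals else overall)
    return [profile[t % period] for t in range(len(series))]


def hybrid_impute(data, period):
    """Fill missing cells with temporal estimate + spatial residual estimate.

    data[s][t] is the reading of sensor s at time t, or None if missing.
    """
    n_sensors, n_steps = len(data), len(data[0])
    temporal = [temporal_fit(row, period) for row in data]

    # Residuals (observed minus temporal fit) exist only at observed cells.
    resid = [[None if data[s][t] is None else data[s][t] - temporal[s][t]
              for t in range(n_steps)] for s in range(n_sensors)]

    filled = [row[:] for row in data]
    for s in range(n_sensors):
        for t in range(n_steps):
            if data[s][t] is not None:
                continue
            # Spatial component: mean residual of the other sensors at time t.
            # Stand-in for the iterative random forest (assumption).
            neighbours = [resid[k][t] for k in range(n_sensors)
                          if k != s and resid[k][t] is not None]
            # Completely missing time step: only the temporal part is left.
            spatial = mean(neighbours) if neighbours else 0.0
            filled[s][t] = temporal[s][t] + spatial
    return filled


# Toy example: two sensors, period of two time slots, one missing reading.
toy = [[10, 20, 12, 20],
       [30, 40, None, 40]]
print(hybrid_impute(toy, period=2)[1][2])
```

In the toy example, the second sensor's missing reading is estimated from its own slot profile plus the deviation observed at the first sensor at the same time step. When every sensor is missing at a time step (the completely missing pattern), the sketch falls back to the temporal estimate alone.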
Pages: 23