Multiple Imputation for Incomplete Traffic Accident Data Using Chained Equations

Authors
Li, Linchao [1]
Zhang, Jian [1]
Wang, Yonggang [2]
Ran, Bin [1]
Affiliations
[1] Southeast Univ, Sch Transportat, Nanjing, Jiangsu, Peoples R China
[2] Changan Univ, Sch Highway, Xian, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
imputation model; missing values; recovery; traffic safety; METHODOLOGICAL ALTERNATIVES; STATISTICAL-ANALYSIS; MISSING VALUES;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Missing values in traffic accident data hinder the discovery of significant factors for reducing accident severity and can even lead to invalid conclusions. To handle this problem, previous studies mainly tried to adapt their methodologies to fit the incomplete data. In this paper, we instead propose a method that imputes the missing values in a traffic accident data set. The method, multiple imputation by chained equations (MICE), is flexible and practical, and it can cope with both univariate and multivariate missing values. The proposed algorithm is compared with two traditional imputation methods on two publicly available traffic accident datasets from New York. Furthermore, we test the performance of the model under different missing ratios. The imputations for continuous and discrete variables are analyzed separately. The results indicate that the proposed model outperforms the other two in almost all situations.
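The chained-equations idea named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes two continuous columns, a simple linear regression as the conditional model for each, and residual noise as the only source of randomness between imputed copies. A full MICE procedure would also draw the regression parameters from their posterior and would use a suitable model (for example logistic regression) for discrete variables. The function names `fit_line`, `chained_imputation`, and `multiple_imputation` are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    return a, b, statistics.pstdev(resid)


def chained_imputation(rows, n_iter=10, rng=None):
    """Return one completed copy of a two-column dataset (None = missing)."""
    if rng is None:
        rng = random.Random(0)
    missing = [[v is None for v in row] for row in rows]
    # Step 1: start every missing cell at its column's observed mean.
    means = [statistics.fmean(r[j] for r in rows if r[j] is not None)
             for j in range(2)]
    data = [[means[j] if row[j] is None else row[j] for j in range(2)]
            for row in rows]
    # Step 2: cycle through the columns, re-imputing each from the other.
    for _ in range(n_iter):
        for j in range(2):
            k = 1 - j  # index of the predictor column
            obs = [i for i in range(len(data)) if not missing[i][j]]
            a, b, sd = fit_line([data[i][k] for i in obs],
                                [data[i][j] for i in obs])
            for i in range(len(data)):
                if missing[i][j]:
                    # Prediction plus noise, so that copies differ.
                    data[i][j] = a + b * data[i][k] + rng.gauss(0.0, sd)
    return data


def multiple_imputation(rows, m=5, seed=0):
    """Return m completed copies, each built with its own random stream."""
    return [chained_imputation(rows, rng=random.Random(seed + s))
            for s in range(m)]


if __name__ == "__main__":
    # Made-up illustrative values, not taken from the paper's datasets.
    sample = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, None], [None, 10.0]]
    for completed in multiple_imputation(sample, m=3):
        print(completed)
```

Each completed copy would then be analyzed separately and the results pooled, which is what distinguishes multiple imputation from filling in a single value per missing cell.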
Pages: 5
Related papers
50 in total
  • [21] A stacked approach for chained equations multiple imputation incorporating the substantive model
    Beesley, Lauren J.
    Taylor, Jeremy M. G.
    BIOMETRICS, 2021, 77 (04) : 1342 - 1354
  • [22] A multiple imputation strategy for incomplete longitudinal data
    Landrum, MB
    Becker, MP
    STATISTICS IN MEDICINE, 2001, 20 (17-18) : 2741 - 2760
  • [23] Multiple imputation for incomplete data with semicontinuous variables
    Javaras, KN
    Van Dyk, DA
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2003, 98 (463) : 703 - 715
  • [24] Multiple Imputation for Incomplete Data in Epidemiologic Studies
    Harel, Ofer
    Mitchell, Emily M.
    Perkins, Neil J.
    Cole, Stephen R.
    Tchetgen, Eric J. Tchetgen
    Sun, BaoLuo
    Schisterman, Enrique F.
    AMERICAN JOURNAL OF EPIDEMIOLOGY, 2018, 187 (03) : 576 - 584
  • [25] Multiple imputation of incomplete multilevel data using Heckman selection models
    Munoz, Johanna
    Efthimiou, Orestis
    Audigier, Vincent
    de Jong, Valentijn M. T.
    Debray, Thomas P. A.
    STATISTICS IN MEDICINE, 2024, 43 (03) : 514 - 533
  • [26] MULTIPLE IMPUTATION OF INCOMPLETE CATEGORICAL DATA USING LATENT CLASS ANALYSIS
    Vermunt, Jeroen K.
    van Ginkel, Joost R.
    van der Ark, L. Andries
    Sijtsma, Klaas
    SOCIOLOGICAL METHODOLOGY, VOL 38, 2008, 38 : 369 - 397
  • [27] Recursive partitioning on incomplete data using surrogate decisions and multiple imputation
    Hapfelmeier, A.
    Hothorn, T.
    Ulm, K.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2012, 56 (06) : 1552 - 1565
  • [29] Generalized Additive Model Multiple Imputation by Chained Equations With Package ImputeRobust
    Salfran, Daniel
    Spiess, Martin
    R JOURNAL, 2018, 10 (01) : 61 - 72
  • [30] Ranking contributors to traffic crashes on mountainous freeways from an incomplete dataset: A sequential approach of multivariate imputation by chained equations and random forest classifier
    Li, Linchao
    Prato, Carlo G.
    Wang, Yonggang
    ACCIDENT ANALYSIS AND PREVENTION, 2020, 146 (146):