Enhancing Crash Injury Severity Prediction on Imbalanced Crash Data by Sampling Technique with Variable Selection

Authors
Yahaya, Mahama
Jiang, Xinguo
Fu, Chuanyun
Bashir, Kamal
Fan, Wenbo
Keywords
Crash injury severity; Class imbalance; SMOTE; Variable selection; Wrapper; Filter; Classification
DOI
Not available
Chinese Library Classification number
U [Transportation]
Discipline classification code
08; 0823
Abstract
Road crash data analysis has long informed road and automobile design and guided the implementation of policies aimed at enhancing road safety. However, crash data suffer from class imbalance and high dimensionality, which may severely degrade the predictions of the analysis model. This study proposes a framework that combines variable selection with the synthetic minority over-sampling technique (SMOTE) for data balancing. We explored three variable selection (VS) techniques: two filter-based methods, Chi-square (CS) and correlation feature selection (CFS), and an embedded method, random forest (RF). To study the imbalanced data problem and its implications for VS, two training data scenarios were considered: (1) VS based on the original data and modelling based on the original data; (2) VS based on the sampled data and modelling based on the original data. The impact of varying the class distribution of the data was also examined. Naive Bayes (NB) classifiers were trained on the various selected subsets, and their predictions were assessed with two types of metrics. Overall, eight models were developed and analysed. The empirical results demonstrate that using balanced data can help identify the most important predictors of crash injury severity. The filter-based ranking methods are more robust against data imbalance than the wrapper. The NB classifier produced better predictions on the optimal subsets identified by the filter-based methods than on the one chosen by the wrapper.
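The second training scenario in the abstract (VS on sampled data, modelling on original data) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the function names (`smote`, `chi_square`, `rank_features`), the binarisation threshold, and the neighbour count `k` are all assumptions made for illustration, and the Naive Bayes modelling step is omitted.

```python
import random

def smote(minority, n_new, k=3, seed=0):
    """Basic SMOTE idea: create n_new synthetic rows by interpolating between
    a minority row and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to x.
        others = [r for r in minority if r is not x]
        others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(x, r)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic

def chi_square(values, labels):
    """Chi-square statistic between a categorical variable and the class label."""
    n = len(values)
    stat = 0.0
    for c in sorted(set(values)):
        for y in sorted(set(labels)):
            observed = sum(1 for v, l in zip(values, labels) if v == c and l == y)
            expected = values.count(c) * labels.count(y) / n
            stat += (observed - expected) ** 2 / expected
    return stat

def rank_features(X, y, threshold=0.5):
    """Filter-style ranking: binarise each column (assumed threshold) and
    sort column indices by descending chi-square score."""
    scores = []
    for j in range(len(X[0])):
        column = [int(row[j] >= threshold) for row in X]
        scores.append((chi_square(column, y), j))
    return [j for _, j in sorted(scores, reverse=True)]

# Hypothetical toy data: class 0 = non-severe (majority), class 1 = severe (minority).
# Column 0 is informative, column 1 is noise.
majority = [[0, i % 2] for i in range(12)]
minority = [[1, 0], [1, 1], [1, 0], [1, 1]]
X_orig = majority + minority
y_orig = [0] * len(majority) + [1] * len(minority)

# Variable selection on the balanced (sampled) data ...
synthetic = smote(minority, n_new=len(majority) - len(minority))
X_bal = X_orig + synthetic
y_bal = y_orig + [1] * len(synthetic)
ranking = rank_features(X_bal, y_bal)
selected = ranking[:1]

# ... while the model would be trained on the original data,
# restricted to the selected variables.
X_reduced = [[row[j] for j in selected] for row in X_orig]
```

In practice a maintained library implementation of SMOTE and of the filter scores would be preferable; the sketch only shows where balancing enters the pipeline relative to selection and modelling.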
Pages: 363-368
Page count: 6