Analysis of Severe Injuries in Crashes Involving Large Trucks Using K-Prototypes Clustering-Based GBDT Model

Cited by: 6
Authors
Tahfim, Syed As-Sadeq [1]
Yan, Chen [1]
Affiliations
[1] Dalian Maritime Univ, Dept Maritime Econ & Management, Linghai Rd, Dalian 116026, Peoples R China
Keywords
large trucks; severe injuries; heterogeneity; k-prototypes; clustering; GBDT; machine learning; LATENT CLASS; TRAFFIC ACCIDENTS; VEHICLE CRASHES; RURAL HIGHWAYS; SINGLE; SEVERITIES; CLASSIFICATION; ALGORITHM; ACCURACY; TREE;
DOI
10.3390/safety7020032
Chinese Library Classification
R1 [Preventive Medicine, Hygiene]
Discipline classification codes
1004; 120402
Abstract
The unobserved heterogeneity in traffic crash data hides certain relationships between the contributory factors and injury severity. The literature has rarely explored different types of clustering methods for analyzing injury severity in crashes involving large trucks. Additionally, the variability of data types in traffic crash data has rarely been addressed. This study explored the application of the k-prototypes clustering method to address the unobserved heterogeneity in large truck-involved crashes that occurred in the United States between 2016 and 2019. The study segmented the entire dataset (EDS) into three homogeneous clusters. Four gradient boosted decision tree (GBDT) models were developed, one on the EDS and one on each cluster, to predict the injury severity in crashes involving large trucks. The input features included crash characteristics, truck characteristics, roadway attributes, time and location of the crash, and environmental factors. Each cluster-based GBDT model was compared with the EDS-based model. Two of the three cluster-based models showed significant improvement in predictive performance. Additionally, feature analysis using the SHAP (Shapley additive explanations) method identified a few new important features in each cluster and showed that some features affect severe injuries to different degrees in the individual clusters. The study concluded that the k-prototypes clustering-based GBDT model is a promising approach for revealing hidden insights, which can be used to improve safety measures, roadway conditions, and policies for the prevention of severe injuries in crashes involving large trucks.
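The clustering step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It shows only the standard k-prototypes idea for mixed data: squared Euclidean distance on numeric features plus a weight `gamma` times the number of mismatched categorical features, with each prototype updated to the numeric mean and the categorical mode of its members. The toy records, the feature meanings (scaled numeric values, weather, lighting), the value of `gamma`, and all function names are assumptions made for illustration.

```python
from collections import Counter


def mixed_distance(num_a, cat_a, num_b, cat_b, gamma):
    """k-prototypes dissimilarity: squared Euclidean + gamma * categorical mismatches."""
    numeric_part = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    categorical_part = sum(1 for x, y in zip(cat_a, cat_b) if x != y)
    return numeric_part + gamma * categorical_part


def assign(records, prototypes, gamma):
    """Assign each (numeric, categorical) record to its nearest prototype."""
    labels = []
    for num, cat in records:
        dists = [mixed_distance(num, cat, p_num, p_cat, gamma)
                 for p_num, p_cat in prototypes]
        labels.append(dists.index(min(dists)))
    return labels


def update_prototypes(records, labels, old_prototypes):
    """Recompute each prototype: mean of numeric parts, mode of categorical parts."""
    new_prototypes = []
    for j, old in enumerate(old_prototypes):
        members = [r for r, lab in zip(records, labels) if lab == j]
        if not members:  # keep the previous prototype if a cluster is empty
            new_prototypes.append(old)
            continue
        n_num, n_cat = len(members[0][0]), len(members[0][1])
        means = tuple(sum(m[0][i] for m in members) / len(members)
                      for i in range(n_num))
        modes = tuple(Counter(m[1][i] for m in members).most_common(1)[0][0]
                      for i in range(n_cat))
        new_prototypes.append((means, modes))
    return new_prototypes


# Hypothetical toy records: (scaled numeric features), (weather, lighting).
records = [
    ((0.2, 0.1), ("clear", "daylight")),
    ((0.3, 0.2), ("clear", "daylight")),
    ((0.9, 0.8), ("rain", "dark")),
    ((0.8, 0.9), ("rain", "dark")),
]
gamma = 0.5  # assumed weight balancing numeric and categorical parts
prototypes = [records[0], records[2]]  # naive initialisation for the sketch

for _ in range(5):  # a few alternating assign/update passes
    labels = assign(records, prototypes, gamma)
    prototypes = update_prototypes(records, labels, prototypes)
```

In a workflow like the one the abstract describes, the resulting `labels` would split the dataset into subsets, and a separate GBDT classifier would then be trained on each subset and on the full dataset for comparison. That step is omitted here because it depends on a specific boosting library.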
Pages: 18