Improvement of random forest by multiple imputation applied to tower crane accident prediction with missing data

Cited by: 14
Authors
Jiang, Ling [1]
Zhao, Tingsheng [1]
Feng, Chuxuan [1]
Zhang, Wei [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
Random forest; Tower crane; Missing data; Accident prediction; Multiple imputation; CONSTRUCTION SITES; SAFETY; SELECTION;
DOI
10.1108/ECAM-07-2021-0606
Chinese Library Classification
T [Industrial Technology]
Subject classification code
08
Abstract
Purpose: This research aims to predict tower crane accident phases from incomplete data.
Design/methodology/approach: Tower crane accident records are collected to train the prediction model, and random forest (RF) is used for prediction. Missing values in new inputs must be filled in before prediction, yet complete data are difficult to collect on construction sites. The authors therefore use multiple imputation (MI) to improve RF. Finally, the prediction model is applied to a case study.
Findings: The results show that multiple imputation RF (MIRF) can effectively predict tower crane accidents when the data are incomplete. The research also provides an importance ranking of tower crane safety factors. Critical factors should receive particular attention on site, because missing data seriously affect the prediction results. The values of the critical factors also influence tower crane safety.
Practical implications: This research promotes the application of machine learning methods to accident prediction in actual projects. Using on-site data, the accident phase of a tower crane can be predicted, and the results can be used for tower crane accident prevention.
Originality/value: Previous studies have seldom predicted tower crane accidents, and especially the accident phase. This research uses tower crane data collected on site to predict the phase of a tower crane accident, and it accounts for incomplete data collection as it occurs in practice.
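The abstract describes filling missing values in new inputs by multiple imputation before applying a trained random forest. A minimal Python sketch of that general idea follows. It is not the authors' implementation: it assumes scikit-learn, uses `IterativeImputer` with `sample_posterior=True` as a stand-in for a multiple imputation procedure, and pools the imputed copies by averaging class probabilities, which is one common choice rather than a detail given in the record. The function name `mirf_predict_proba` and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer


def mirf_predict_proba(X_train, y_train, X_new, n_imputations=5, seed=0):
    """Average RF class probabilities over several imputed copies of X_new.

    Assumption: X_train is complete; only X_new contains NaN entries.
    """
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X_train, y_train)

    pooled = np.zeros((X_new.shape[0], len(forest.classes_)))
    for m in range(n_imputations):
        # A different seed per pass gives a different random draw for each
        # missing entry, so each pass yields a distinct completed data set.
        imputer = IterativeImputer(
            sample_posterior=True, max_iter=10, random_state=seed + m
        )
        imputer.fit(X_train)
        pooled += forest.predict_proba(imputer.transform(X_new))

    # Pool by simple averaging (an assumption, not the paper's stated rule).
    return forest.classes_, pooled / n_imputations


if __name__ == "__main__":
    # Synthetic stand-in for site data: 4 hypothetical safety factors,
    # 2 hypothetical accident phases.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    X_new = rng.normal(size=(3, 4))
    X_new[0, 1] = np.nan  # a factor that could not be observed on site
    classes, proba = mirf_predict_proba(X, y, X_new)
    print(classes, proba)
```

Averaging over several imputed copies, rather than using a single filled-in value, lets the uncertainty about the missing factor carry through to the predicted phase probabilities.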
Pages: 1222 - 1242
Page count: 21
Related papers
50 in total
  • [21] Multiple imputation for nonignorable missing data
    Jongho Im
    Soeun Kim
    Journal of the Korean Statistical Society, 2017, 46 : 583 - 592
  • [22] Missing data analysis in cognitive diagnostic models: Random forest threshold imputation method
    You Xiaofeng
    Yang Jianqin
    Qin Chunying
    Liu Hongyun
    ACTA PSYCHOLOGICA SINICA, 2023, 55 (07) : 1192 - 1206
  • [23] Multiple imputation of missing data under missing at random: compatible imputation models are not sufficient to avoid bias if they are mis-specified
    Curnow, Elinor
    Carpenter, James R.
    Heron, Jon E.
    Cornish, Rosie P.
    Rach, Stefan
    Didelez, Vanessa
    Langeheine, Malte
    Tilling, Kate
    JOURNAL OF CLINICAL EPIDEMIOLOGY, 2023, 160 : 100 - 109
  • [24] Missing data in association studies: a multiple imputation approach applied to case/control and family data
    Croiseau, P
    Genin, E
    Cordell, HJ
    GENETIC EPIDEMIOLOGY, 2005, 29 (03) : 241 - 241
  • [25] Cost-effectiveness analysis of clinical trials with missing data: using multiple imputation to address data missing not at random
    Leurent, Baptiste
    Gomes, Manuel
    Carpenter, James
    TRIALS, 2017, 18
  • [26] Missing value imputation for the analysis of incomplete traffic accident data
    Deb, Rupam
    Liew, Alan Wee-Chung
    INFORMATION SCIENCES, 2016, 339 : 274 - 289
  • [27] Multiple imputation of missing data for survey data analysis
    Lupo, Coralie
    Le Bouquin, Sophie
    Michel, Virginie
    Colin, Pierre
    Chauvin, Claire
    EPIDEMIOLOGIE ET SANTE ANIMALE, 2008, (53) : 73 - 83
  • [28] A multiple imputation-based sensitivity analysis approach for data subject to missing not at random
    Hsu, Chiu-Hsieh
    He, Yulei
    Hu, Chengcheng
    Zhou, Wei
    STATISTICS IN MEDICINE, 2020, 39 (26) : 3756 - 3771
  • [29] Multiple imputation for missing data: a brief introduction
    Baccini, Michela
    EPIDEMIOLOGIA & PREVENZIONE, 2008, 32 (03) : 162 - 163
  • [30] Multiple imputation for missing data - A cautionary tale
    Allison, PD
    SOCIOLOGICAL METHODS & RESEARCH, 2000, 28 (03) : 301 - 309