Improvement of random forest by multiple imputation applied to tower crane accident prediction with missing data

Cited by: 14
Authors
Jiang, Ling [1]
Zhao, Tingsheng [1]
Feng, Chuxuan [1]
Zhang, Wei [1]
Affiliation
[1] Huazhong University of Science and Technology, Wuhan, People's Republic of China
Funding
National Key Research and Development Program of China
Keywords
Random forest; Tower crane; Missing data; Accident prediction; Multiple imputation; Construction sites; Safety; Selection
DOI
10.1108/ECAM-07-2021-0606
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Purpose: This research aims to predict tower crane accident phases from incomplete data.
Design/methodology/approach: Tower crane accident records are collected to train the prediction model, and random forest (RF) is used for prediction. Missing values in new inputs must be filled in before RF can make a prediction, yet complete data are difficult to collect on a construction site. The authors therefore use multiple imputation (MI) to improve RF. Finally, the prediction model is applied to a case study.
Findings: The results show that multiple imputation RF (MIRF) can effectively predict tower crane accidents when the data are incomplete. The research also provides an importance ranking of tower crane safety factors. Critical factors deserve particular attention on site, because missing data for these factors seriously affect the prediction results, and the values of these factors also influence tower crane safety.
Practical implications: This research promotes the application of machine learning methods to accident prediction in actual projects. From data collected on site, the authors can predict the accident phase of a tower crane, and the results can be used for tower crane accident prevention.
Originality/value: Previous studies have seldom predicted tower crane accidents, and the accident phase in particular. This research uses tower crane data collected on site to predict the phase of the accident, and it accounts for the incomplete data collection that occurs in practice.
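The abstract does not say how the imputations are generated or how the resulting predictions are combined, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes scikit-learn, uses `IterativeImputer` with posterior sampling as a stand-in for the MI step, and pools the m predictions by averaging class probabilities. The function names `fit_mirf` and `predict_mirf`, the number of imputations, and the synthetic data are all hypothetical.

```python
import numpy as np
# IterativeImputer is still experimental in scikit-learn; this import enables it.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestClassifier


def fit_mirf(X_train, y_train, n_imputations=5, seed=0):
    """Fit one random forest on complete training data and m stochastic imputers.

    Assumption: the training records are complete, and only new inputs
    contain missing values (NaN).
    """
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X_train, y_train)
    # Each imputer draws imputed values from a posterior predictive
    # distribution, so different seeds give different plausible completions.
    imputers = [
        IterativeImputer(sample_posterior=True, max_iter=10,
                         random_state=seed + j).fit(X_train)
        for j in range(n_imputations)
    ]
    return forest, imputers


def predict_mirf(forest, imputers, X_new):
    """Complete X_new once per imputer, predict with RF, and average the results."""
    probs = [forest.predict_proba(imp.transform(X_new)) for imp in imputers]
    mean_prob = np.mean(probs, axis=0)  # pool over the m completed datasets
    labels = forest.classes_[mean_prob.argmax(axis=1)]
    return labels, mean_prob


if __name__ == "__main__":
    # Synthetic data for illustration only; not related to the paper's dataset.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 4))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    forest, imputers = fit_mirf(X, y)
    X_new = X[:3].copy()
    X_new[0, 1] = np.nan  # simulate a factor that was not recorded on site
    print(predict_mirf(forest, imputers, X_new))
```

Averaging probabilities over several imputations, rather than filling each gap once, lets the uncertainty about the missing factor carry through to the predicted class.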
Pages: 1222-1242
Number of pages: 21