Improvement of random forest by multiple imputation applied to tower crane accident prediction with missing data

Cited by: 14

Authors:

Jiang, Ling [1]

Zhao, Tingsheng [1]

Feng, Chuxuan [1]

Zhang, Wei [1]

Affiliation:

[1] Huazhong University of Science and Technology, Wuhan, People's Republic of China

Source:

ENGINEERING CONSTRUCTION AND ARCHITECTURAL MANAGEMENT | 2023, Vol. 30, No. 3

Funding:

National Key Research and Development Program of China

Keywords:

Random forest; Tower crane; Missing data; Accident prediction; Multiple imputation; Construction sites; Safety; Selection

DOI:

10.1108/ECAM-07-2021-0606

Chinese Library Classification (CLC):

T [Industrial Technology]

Discipline classification code:

08

Abstract:

Purpose: This research aims to predict the phase in which a tower crane accident occurs when the available data are incomplete.

Design/methodology/approach: Tower crane accident records are collected to train a prediction model, and random forest (RF) is used for prediction. Missing values in new inputs must be filled in before prediction, yet complete data are difficult to collect on a construction site. The authors therefore use the multiple imputation (MI) method to improve RF. Finally, the prediction model is applied to a case study.

Findings: The results show that multiple imputation RF (MIRF) can effectively predict tower crane accidents when the data are incomplete. The research also provides an importance ranking of tower crane safety factors. The critical factors deserve particular attention on site, because missing data seriously affect the prediction results. The values of the critical factors also influence tower crane safety.

Practical implications: This research promotes the application of machine learning methods to accident prediction in actual projects. Using on-site data, the accident phase of a tower crane can be predicted, and the results can be used for tower crane accident prevention.

Originality/value: Previous studies have seldom predicted tower crane accidents, and especially not the accident phase. This research uses tower crane data collected on site to predict the phase of a tower crane accident, and it accounts for incomplete data collection as encountered in practice.
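The abstract names the general approach (RF for prediction, MI to fill missing inputs) but gives no implementation details. The following is only a minimal sketch of one way such a pipeline could look, assuming scikit-learn and NumPy, purely numeric features, and synthetic stand-in data. The helper names `fit_mirf` and `predict_mirf`, the use of `IterativeImputer` with posterior sampling, and the pooling rule (averaging class probabilities over the imputed copies) are all assumptions made for illustration, not the authors' published method.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestClassifier


def fit_mirf(X_train, y_train, n_imputations=5, seed=0):
    """Hypothetical helper: fit one forest plus several stochastic imputers."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X_train, y_train)
    # Each imputer uses a different seed, so each draws different plausible
    # values for a missing entry (sample_posterior=True).
    imputers = [
        IterativeImputer(sample_posterior=True, max_iter=5,
                         random_state=seed + k).fit(X_train)
        for k in range(n_imputations)
    ]
    return forest, imputers


def predict_mirf(forest, imputers, X_new):
    """Hypothetical helper: pool predictions over the imputed copies of X_new."""
    probas = [forest.predict_proba(imp.transform(X_new)) for imp in imputers]
    mean_proba = np.mean(probas, axis=0)  # assumed pooling rule: average
    labels = forest.classes_[mean_proba.argmax(axis=1)]
    return labels, mean_proba


# Synthetic stand-in data: 4 numeric "safety factors", 2 accident phases.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 4))
y_train = (X_train[:, 0] + X_train[:, 1] > 0).astype(int)

# New on-site observations with some factors not recorded (NaN).
X_new = rng.normal(size=(3, 4))
X_new[0, 1] = np.nan
X_new[2, 3] = np.nan

forest, imputers = fit_mirf(X_train, y_train)
labels, proba = predict_mirf(forest, imputers, X_new)
print(labels)
print(proba.round(2))
```

Averaging over several imputed copies, rather than filling each gap once, lets the uncertainty about a missing factor carry through to the predicted phase probabilities. Real accident records are likely to include categorical factors, which would need encoding or a different imputation model than the one assumed here.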


Pages: 1222-1242

Page count: 21

