Improvement of random forest by multiple imputation applied to tower crane accident prediction with missing data

Cited by: 14
Authors
Jiang, Ling [1]
Zhao, Tingsheng [1]
Feng, Chuxuan [1]
Zhang, Wei [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
Random forest; Tower crane; Missing data; Accident prediction; Multiple imputation; CONSTRUCTION SITES; SAFETY; SELECTION;
DOI
10.1108/ECAM-07-2021-0606
Chinese Library Classification number
T [Industrial Technology]
Subject classification code
08
Abstract
Purpose: This research aims to predict tower crane accident phases from incomplete data.
Design/methodology/approach: Tower crane accident records are collected to train the prediction model, and random forest (RF) is used for prediction. Missing values in new inputs must be filled in before prediction, yet it is difficult to collect complete data on a construction site. The authors therefore use the multiple imputation (MI) method to improve RF. Finally, the prediction model is applied to a case study.
Findings: The results show that multiple imputation RF (MIRF) can effectively predict tower crane accidents when the data are incomplete. This research provides an importance ranking of tower crane safety factors. The critical factors deserve particular attention on site, because missing data seriously affect the prediction results. The values of the critical factors also influence tower crane safety.
Practical implications: This research promotes the application of machine learning methods to accident prediction in actual projects. From onsite data, the authors can predict the accident phase of a tower crane. The results can be used for tower crane accident prevention.
Originality/value: Previous studies have seldom predicted tower crane accidents, and especially not the phase of the accident. This research uses tower crane data collected on site to predict the phase of a tower crane accident. Incomplete data collection is taken into account to reflect the actual situation.
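The abstract does not spell out how MI and RF are combined, so the following is only a minimal sketch of the general idea, not the authors' MIRF procedure. It assumes scikit-learn is available and swaps in `IterativeImputer` with `sample_posterior=True` as a stand-in for the multiple imputation step. The data are synthetic, the three phase labels are hypothetical, and `predict_with_multiple_imputation` is an illustrative name. Each incomplete input is filled in `m` times, the forest predicts on every filled-in copy, and the class probabilities are averaged.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)

# Synthetic stand-in for accident records: 300 rows, 5 numeric safety factors.
X_train = rng.normal(size=(300, 5))
# Hypothetical accident-phase label (0, 1, 2), driven mostly by the first two factors.
score = X_train[:, 0] + 0.5 * X_train[:, 1]
y_train = np.digitize(score, bins=[-0.5, 0.5])

# Train the random forest on the complete records.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# New on-site observations, with gaps encoded as NaN.
X_new = rng.normal(size=(4, 5))
X_new[0, 1] = np.nan
X_new[2, [0, 3]] = np.nan


def predict_with_multiple_imputation(model, X_complete, X_incomplete, m=5):
    """Average class probabilities over m stochastic imputations of X_incomplete."""
    probs = []
    for seed in range(m):
        # sample_posterior=True draws imputed values from a predictive
        # distribution, so each seed yields a different plausible completion.
        imputer = IterativeImputer(sample_posterior=True, max_iter=5, random_state=seed)
        imputer.fit(X_complete)
        X_filled = imputer.transform(X_incomplete)
        probs.append(model.predict_proba(X_filled))
    return np.mean(probs, axis=0)


proba = predict_with_multiple_imputation(forest, X_train, X_new)
phase = forest.classes_[proba.argmax(axis=1)]
print(proba.round(2))
print(phase)
```

Averaging probabilities is one common way to pool predictions across imputations; a majority vote over the `m` predicted labels would be an alternative. Real safety-factor data are likely to include categorical variables, which this numeric sketch does not handle.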
Pages: 1222 - 1242
Page count: 21
Related papers
50 in total
  • [41] An evaluation of methods to handle missing data in the context of latent variable interaction analysis: multiple imputation, maximum likelihood, and random forest algorithm
    Shin, Tacksoo
    Long, Jeffrey D.
    Davison, Mark L.
    JAPANESE JOURNAL OF STATISTICS AND DATA SCIENCE, 2022, 5 (02) : 629 - 659
  • [42] An evaluation of methods to handle missing data in the context of latent variable interaction analysis: multiple imputation, maximum likelihood, and random forest algorithm
    Tacksoo Shin
    Jeffrey D. Long
    Mark L. Davison
    Japanese Journal of Statistics and Data Science, 2022, 5 : 629 - 659
  • [43] Random forest missing data algorithms
    Tang, Fei
    Ishwaran, Hemant
    STATISTICAL ANALYSIS AND DATA MINING, 2017, 10 (06) : 363 - 377
  • [44] A New Missing Data Imputation Algorithm Applied to Electrical Data Loggers
    Crespo Turrado, Concepcion
    Sanchez Lasheras, Fernando
    Luis Calvo-Rolle, Jose
    Jose Pinon-Pazos, Andres
    de Cos Juez, Francisco Javier
    SENSORS, 2015, 15 (12) : 31069 - 31082
  • [45] Imputation of missing clinical, cognitive and neuroimaging data of Dementia using missForest, a Random Forest based algorithm
    Aracri, Federica
    Bianco, Maria Giovanna
    Quattrone, Andrea
    Sarica, Alessia
    2023 IEEE 36TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS, CBMS, 2023, : 684 - 688
  • [46] Attempts Prediction by Missing Data Imputation in Engineering Degree
    Jove, Esteban
    Blanco-Rodriguez, Patricia
    Luis Casteleiro-Roca, Jose
    Moreno-Arboleda, Javier
    Antonio Lopez-Vazquez, Jose
    de Cos Juez, Francisco Javier
    Luis Calvo-Rolle, Jose
    INTERNATIONAL JOINT CONFERENCE SOCO'17- CISIS'17-ICEUTE'17 PROCEEDINGS, 2018, 649 : 167 - 176
  • [47] Comparison of Random Forest and Parametric Imputation Models for Imputing Missing Data Using MICE: A CALIBER Study
    Shah, Anoop D.
    Bartlett, Jonathan W.
    Carpenter, James
    Nicholas, Owen
    Hemingway, Harry
    AMERICAN JOURNAL OF EPIDEMIOLOGY, 2014, 179 (06) : 764 - 774
  • [48] Taking 'don't knows' as valid responses: A multiple complete random imputation of missing data
    Kroh, M
    QUALITY & QUANTITY, 2006, 40 (02) : 225 - 244
  • [49] Taking ‘Don’t Knows’ as Valid Responses: A Multiple Complete Random Imputation of Missing Data
    Martin Kroh
    Quality and Quantity, 2006, 40 : 225 - 244
  • [50] Multiple imputation of missing at random data: General points and presentation of a Monte-Carlo method
    Cottrell, G.
    Cot, M.
    Mary, J. -Y.
    REVUE D EPIDEMIOLOGIE ET DE SANTE PUBLIQUE, 2009, 57 (05) : 361 - 372