A Study on Data Pre-Processing and Accident Prediction Modelling for Occupational Accident Analysis in the Construction Industry

Cited by: 15
Authors
Lee, Jae Yun [1]
Yoon, Young Geun [1]
Oh, Tae Keun [1, 2]
Park, Seunghee [3]
Ryu, Sang Il [4]
Affiliations
[1] Incheon Natl Univ, Dept Safety Engn, Incheon 22012, South Korea
[2] Incheon Natl Univ, Res Inst Engn & Technol, Incheon 22012, South Korea
[3] Sungkyunkwan Univ, Sch Civil Architectural Engn & Landscape Architec, Gyeonggi 440746, South Korea
[4] Dong Eui Univ, Dept Fire Adm & Disaster Management, 176 Eomgwang Ro, Busan 47340, South Korea
Source
APPLIED SCIENCES-BASEL | 2020 / Vol. 10 / Issue 21
Funding
National Research Foundation, Singapore;
Keywords
occupational accident; correlation analysis; support vector machine; ensemble; data preprocessing; latent class clustering analysis; alluvial flow diagram; SAFETY MANAGEMENT; DECISION; TREES; RISK;
DOI
10.3390/app10217949
Chinese Library Classification number
O6 [Chemistry];
Subject classification code
0703;
Abstract
In the construction industry, it is difficult to predict occupational accidents because various accident characteristics arise simultaneously and organically in different types of work. Furthermore, even when analyzing occupational accident data, it is difficult to deduce meaningful results because the data recorded by the incident investigator are qualitative and include a wide variety of data types and categories. Recently, numerous studies have used machine learning to analyze the correlations in such complex construction accident data; however, heretofore the focus has been on predicting severity with various variables, and several limitations remain when deriving the correlations between features from various variables. Thus, this paper proposes a data processing procedure that can efficiently manipulate accident data using optimal machine learning techniques and derive and systematize meaningful variables to rationally approach such complex problems. In particular, among the various variables, the most influential variables are derived through methods such as clustering, chi-square, Cramer's V, and predictor importance; then, the analysis is simplified by optimally grouping the variables. For accident data with optimal variables and elements, a predictive model is constructed between variables, using a support vector machine and decision-tree-based ensemble; then, the correlation between the dependent and independent variables is analyzed through an alluvial flow diagram for several cases. Therefore, a new processing procedure has been introduced in data preprocessing and accident prediction modelling to overcome difficulties from complex and diverse construction occupational accident data, and effective accident prevention is possible by deriving correlations of construction accidents using this process.
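The abstract names chi-square and Cramer's V among the methods used to find the most influential variables. As a minimal sketch of that kind of screening step, and not the authors' implementation, the following computes both statistics for two categorical variables. The function name `cramers_v` and the sample labels (work type, accident type) are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Return (chi2, V) for two equal-length sequences of category labels.

    chi2 is the Pearson chi-square statistic of the contingency table;
    V = sqrt(chi2 / (n * (min(rows, cols) - 1))) is Cramer's V (no bias correction).
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed cell counts
    row = Counter(xs)             # row marginals
    col = Counter(ys)             # column marginals

    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row), len(col)) - 1
    if k == 0:
        # A variable with a single category carries no association information.
        return chi2, 0.0
    return chi2, sqrt(chi2 / (n * k))


# Hypothetical records: work type vs. accident type (illustrative only).
work_type = ["scaffold", "scaffold", "excavation", "excavation"]
accident_type = ["fall", "fall", "collapse", "collapse"]
chi2, v = cramers_v(work_type, accident_type)
print(f"chi2={chi2:.2f}, V={v:.2f}")
```

Values of V near 1 indicate a strong association between two categorical variables and values near 0 indicate a weak one, so ranking candidate variables by V against the outcome is one simple way to shortlist them before modelling.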
Pages: 1 / 23
Page count: 23