Data-driven automatic classification model for construction accident cases using natural language processing with hyperparameter tuning

Times cited: 0
Authors
Kumi, Louis [1]
Jeong, Jaewook [1,3]
Jeong, Jaemin [1,2]
Affiliations
[1] Seoul Natl Univ Sci & Technol, Dept Safety Engn, Seoul 01811, South Korea
[2] Univ Toronto, Dept Civil & Mineral Engn, Toronto, ON, Canada
[3] Seoul Natl Univ Sci & Technol, 232 Gongneung Ro, Seoul 01811, South Korea
Keywords
Accident classification; Korean NLP; Machine learning; Accident type; Facility type; Work type; Safety management
DOI
10.1016/j.autcon.2024.105458
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
The construction industry, while vital to societal progress, is marred by a high incidence of accidents and injuries. Manual classification of accident cases is labor-intensive and susceptible to human bias. This study addresses this challenge by developing an automated accident case classification system for the construction industry using Natural Language Processing and machine learning techniques. The study was conducted in four steps: (1) establishment of the dataset, (2) Korean Natural Language Processing, (3) selection of machine learning models, and (4) model evaluation. The models exhibited competitive performance, demonstrating high accuracy, precision, and recall rates across all classification tasks. XGBoost outperformed NB, SVM, and KNN for accident type, facility type, and work type, with accuracies of 0.80, 0.56, and 0.67, respectively. The results also provided insights into the factors influencing accident classification. This study contributes to construction safety by providing a data-driven foundation for safety decision-making, resource allocation, and benchmarking.
Pages: 11
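The four steps named in the abstract (dataset, text processing, model selection with hyperparameter tuning, evaluation) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption for illustration and not the authors' implementation: the six English accident descriptions and their labels are invented, whitespace tokenization stands in for Korean morphological analysis, and a small multinomial Naive Bayes classifier (one of the model families the abstract lists as NB) is tuned over a single hyperparameter, the smoothing value `alpha`, using leave-one-out cross-validation.

```python
import math
from collections import Counter, defaultdict

# Step 1: dataset. Hypothetical toy cases, not data from the study.
CASES = [
    ("worker fell from scaffold during formwork", "fall"),
    ("worker fell from ladder while painting", "fall"),
    ("worker fell through roof opening", "fall"),
    ("worker struck by falling pipe from crane", "struck"),
    ("worker struck by reversing excavator", "struck"),
    ("worker struck by swinging beam", "struck"),
]
texts = [text for text, _ in CASES]
labels = [label for _, label in CASES]


def tokenize(text):
    # Step 2: text processing. Assumption: whitespace tokens stand in
    # for the Korean morphological analysis a real pipeline would need.
    return text.lower().split()


class NaiveBayes:
    """Multinomial Naive Bayes with additive smoothing (alpha > 0)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, train_texts, train_labels):
        self.class_counts = Counter(train_labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(train_texts, train_labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(self.class_counts[label] / total)
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def tune_alpha(all_texts, all_labels, grid=(0.1, 0.5, 1.0)):
    # Step 3: hyperparameter tuning by leave-one-out cross-validation.
    scores = {}
    for alpha in grid:
        hits = 0
        for i in range(len(all_texts)):
            model = NaiveBayes(alpha).fit(
                all_texts[:i] + all_texts[i + 1:],
                all_labels[:i] + all_labels[i + 1:],
            )
            hits += model.predict(all_texts[i]) == all_labels[i]
        scores[alpha] = hits / len(all_texts)
    return max(grid, key=lambda a: scores[a]), scores


def accuracy(model, eval_texts, eval_labels):
    # Step 4: evaluation. Fraction of cases classified correctly.
    hits = sum(model.predict(t) == y for t, y in zip(eval_texts, eval_labels))
    return hits / len(eval_texts)


best_alpha, cv_scores = tune_alpha(texts, labels)
final_model = NaiveBayes(best_alpha).fit(texts, labels)
```

Calling `final_model.predict("worker fell from scaffold")` classifies a new description with the tuned model. The abstract reports that XGBoost outperformed NB, SVM, and KNN, so a faithful reproduction would swap in those models and a proper held-out test set; the structure of the tuning loop would stay the same.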