Intelligent mining of safety hazard information from construction documents using semantic similarity and information entropy

Cited by: 5
Authors
Tian, Dan [1]
Li, Mingchao [1]
Shen, Yang [2]
Han, Shuai [1,3]
Affiliations
[1] Tianjin Univ, State Key Lab Hydraul Engn Simulat & Safety, Tianjin 300350, Peoples R China
[2] China Three Gorges Corp, Beijing 100038, Peoples R China
[3] Hong Kong Polytech Univ, Dept Bldg & Real Estate, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Construction documents; Safety hazards; Information mining; Semantic similarity; Word2vec; Information entropy; MUTUAL INFORMATION; TF-IDF; IDENTIFICATION; EXTRACTION; SYSTEM; MODEL;
DOI
10.1016/j.engappai.2022.105742
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Construction sites are known to be very dangerous workplaces because of the large number of safety hazards present. Analysis of construction safety hazards is essential for formulating rational safety management plans and preventing accidents. Construction documents contain large volumes of safety hazard information available for analysis. However, such analyses are challenging because the safety hazard information in construction documents is presented in an unstructured or semi-structured format. This study proposes a method for intelligent mining of safety hazard information, which comprises safety hazard technical term recognition and safety hazard information analysis. The safety hazard technical term recognition model is developed based on semantic similarity and information correlation and is used to build a safety hazard technical term library. Safety hazard information is then mined and analyzed with the term frequency-inverse document frequency (TF-IDF) method, using the technical term library. Finally, the proposed method is applied to build the safety hazard technical term library, which contains 2697 technical terms, and to develop a hydraulic project construction safety hazard analysis system that supports the intelligent recognition and application of technical terms. The system also automatically extracts safety hazard information and provides a visualization interface that shows the safety hazard analysis results intuitively, which improves the efficiency of safety hazard information extraction. The study provides a new approach for recognizing technical terms and mining safety hazard information, which can enhance management efficiency and support practical knowledge discovery for safety management.
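The abstract names three generic building blocks: semantic similarity (Word2vec appears in the keywords), information entropy, and TF-IDF. The sketch below shows textbook versions of each in plain Python, as an illustration only. The paper's exact formulations, thresholds, and data are not given in this record, so the function names, the toy documents, and the particular TF-IDF variant (relative term frequency times log(N / df)) are all assumptions.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors.

    With Word2vec, u and v would be word embeddings; here they are
    plain lists of floats (an assumption for illustration).
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items.

    One common use in term recognition is to measure how varied the
    neighbors of a candidate term are; that use is assumed here.
    """
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    tf = term count / document length, idf = log(N / document frequency).
    This is one textbook variant, not necessarily the paper's.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy corpus of tokenized hazard descriptions.
toy_docs = [
    ["scaffold", "collapse", "scaffold", "guardrail"],
    ["guardrail", "missing", "helmet"],
    ["helmet", "missing", "crane"],
]
toy_weights = tf_idf(toy_docs)
```

In this variant a term that appears in every document receives a weight of zero, so terms concentrated in a few documents rank highest within each document.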
Pages: 17
Related papers
50 items in total
  • [1] An intelligent system for semantic information retrieval information from textual web documents
    Karthik, Mukundan
    Marikkannan, Mariappan
    Kannan, Arputharaj
    COMPUTATIONAL FORENSICS, PROCEEDINGS, 2008, 5158 : 135 - +
  • [2] A hybrid deep semantic mining method considering fuzzy expressions for the automatic recognition of construction safety hazard information
    Zhang, Xiaojian
    Tian, Dan
    Ren, Qiubing
    Li, Mingchao
    Shen, Yang
    Han, Shuai
    ADVANCED ENGINEERING INFORMATICS, 2024, 61
  • [3] Intelligent question answering method for construction safety hazard knowledge based on deep semantic mining
    Tian, Dan
    Li, Mingchao
    Ren, Qiubing
    Zhang, Xiaojian
    Han, Shuai
    Shen, Yang
    AUTOMATION IN CONSTRUCTION, 2023, 145
  • [4] Proactive safety hazard identification using visual-text semantic similarity for construction safety management
    Wang, Yiheng
    Xiao, Bo
    Bouferguene, Ahmed
    Al-Hussein, Mohamed
    AUTOMATION IN CONSTRUCTION, 2024, 166
  • [5] Semantic information extraction from Tamil documents
    Pandian, S. Lakshmana
    Devakumar, J.
    Geetha, T.V.
    International Journal of Metadata, Semantics and Ontologies, 2008, 3 (03) : 226 - 232
  • [6] Audio information retrieval using semantic similarity
    Barrington, Luke
    Chan, Antoni
    Turnbull, Douglas
    Lanckriet, Gert
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL II, PTS 1-3, 2007, : 725 - +
  • [7] Semantic Structuring of and Information Extraction from Medical Documents Using the UMLS
    Denecke, K.
    METHODS OF INFORMATION IN MEDICINE, 2008, 47 (05) : 425 - 434
  • [8] Exhaustive mining of information from unstructured documents
    Soubbotin, Martin
    Soubbotin, Sergei
    WMSCI 2005: 9TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL 1, 2005, : 210 - 215
  • [9] Arabic Information Retrieval Using Semantic Analysis of Documents
    Al-Maghasbeh, Mohammad Khaled A.
    Bin Hamzah, Mohd Pouzi
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2018, 18 (05) : 53 - 58
  • [10] Information content measures of semantic similarity between documents based on Hadoop system
    Birjali, Marouane
    Beni-Hssane, Abderrahim
    Erritali, Mohammed
    Madani, Youness
    2016 INTERNATIONAL CONFERENCE ON WIRELESS NETWORKS AND MOBILE COMMUNICATIONS (WINCOM), 2016, : P187 - P192