Automatic bridge inspection database construction through hybrid information extraction and large language models

Authors
Zhang, Chenhong [1]
Lei, Xiaoming [2]
Xia, Ye [1,3]
Sun, Limin [1,3]
Affiliations
[1] Tongji Univ, Dept Bridge Engn, Shanghai, Peoples R China
[2] Hong Kong Polytech Univ, Dept Civil & Environm Engn, Hong Kong, Peoples R China
[3] Shanghai Qi Zhi Inst, Shanghai, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Bridge inspection data; Natural language processing; Information extraction; Large language model; Pseudo label
DOI
10.1016/j.dibe.2024.100549
Chinese Library Classification
TU [Building Science]
Discipline classification code
0813
Abstract
Regular bridge inspections generate extensive reports that, while critical for maintenance, often remain underutilized due to their unstructured format. Traditional information extraction (IE) methods depend on intricate labeling systems that commonly require time-consuming and labor-intensive labeling. This paper presents a novel bridge inspection database construction method leveraging information extraction assisted by large language models (LLMs). First, we introduce a pseudo-labeling method that uses a closed-source LLM to generate high-quality data. Then, we propose a hybrid extraction pipeline that extracts relevant information segments and processes them with a generation-based IE model fine-tuned on the pseudo-labeled data. Finally, the extracted data is used to construct the bridge inspection database. The proposed method, validated with real-world data, not only demonstrates higher extraction precision than the closed-source LLM used for pseudo-labeling but also outperforms traditional methods in both data preparation time and extraction accuracy. This approach provides a scalable solution for more proactive and data-driven bridge maintenance strategies.
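The three stages named in the abstract (segment selection, structured extraction, database construction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the defect vocabulary, the component list, the table schema, and the function names `select_segments`, `extract_record`, and `build_database` are all hypothetical, and a simple regex stands in for the fine-tuned generation-based IE model.

```python
import re
import sqlite3

# Hypothetical defect vocabulary; the paper's actual label schema is not given here.
DEFECT_TERMS = ("crack", "spalling", "corrosion")


def select_segments(report: str) -> list[str]:
    """Rule-based step: keep only sentences that mention a defect term."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", report) if s.strip()]
    return [s for s in sentences if any(t in s.lower() for t in DEFECT_TERMS)]


def extract_record(segment: str) -> dict:
    """Stand-in for the fine-tuned generation-based IE model (assumption).

    A real pipeline would call the model here; this regex version only
    illustrates the input/output contract of the extraction step.
    """
    lowered = segment.lower()
    defect = next(t for t in DEFECT_TERMS if t in lowered)
    component = re.search(r"\b(girder|pier|deck|abutment)\b", lowered)
    return {
        "component": component.group(1) if component else None,
        "defect": defect,
        "source_text": segment,
    }


def build_database(report: str) -> sqlite3.Connection:
    """Store one row per extracted record in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE defects (component TEXT, defect TEXT, source_text TEXT)"
    )
    records = [extract_record(s) for s in select_segments(report)]
    conn.executemany(
        "INSERT INTO defects VALUES (:component, :defect, :source_text)", records
    )
    conn.commit()
    return conn


# Invented example report, for illustration only.
report = (
    "The bridge was inspected in clear weather. "
    "A longitudinal crack was found on the girder near span 2. "
    "Traffic was light. "
    "Corrosion of exposed rebar was observed on the pier."
)
conn = build_database(report)
rows = conn.execute(
    "SELECT component, defect FROM defects ORDER BY rowid"
).fetchall()
print(rows)
```

Keeping the segment filter separate from the extractor mirrors the hybrid idea in the abstract: the cheap rule-based step narrows the text, so the more expensive model only sees the relevant segments.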
Pages: 16