Rule-based information extraction for mechanical-electrical-plumbing-specific semantic web

Cited by: 30
Authors
Wu, Lang-Tao [1]
Lin, Jia-Rui [1]
Leng, Shuo [1]
Li, Jiu-Lin [2]
Hu, Zhen-Zhong [3]
Affiliations
[1] Tsinghua Univ, Dept Civil Engn, Beijing 100084, Peoples R China
[2] Beijing Urban Construction Grp Co Ltd, Beijing 100088, Peoples R China
[3] Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Information extraction; MEP; Rule match; Named entity recognition; Relation extraction; Natural language understanding; Semantic web; MANAGEMENT; KNOWLEDGE; ONTOLOGY; OBJECTS;
DOI
10.1016/j.autcon.2021.104108
Chinese Library Classification
TU [Building Science];
Discipline classification code
0813;
Abstract
Information extraction (IE), which aims to retrieve meaningful information from plain text, has been widely studied in general and professional domains to support downstream applications. However, due to the lack of labeled data and the complexity of professional mechanical, electrical and plumbing (MEP) information, it is challenging to apply current common deep learning IE methods to the MEP domain. To solve this problem, this paper proposes a rule-based approach for the MEP IE task, including a "snowball" strategy to collect large-scale MEP corpora, a suffix-based matching algorithm on text segments for named entity recognition (NER), and a dependency-path-based matching algorithm on the dependency tree for relationship extraction (RE). Two ideas for RE, called "meta linking" and "path filtering", are also proposed to discover as many out-of-pattern entities and relationships as possible. To verify the feasibility of the proposed approach, 65 MB of MEP corpora were collected as input, and an MEP semantic web consisting of 15,978 entities and 65,110 relationship triples was established, with an accuracy of 81% for entities and 75% for relationship triples. A comparison experiment between classical deep learning models and the proposed rule-based approach was carried out, illustrating that the extraction precision of the proposed method is 37% and 49% better than that of the selected deep learning NER and RE models, respectively.
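The abstract names a suffix-based matching algorithm for NER but does not give its details. As a rough illustration of the general idea only, the Python sketch below treats the last word of a noun phrase as a type-bearing suffix and extends the match leftward over modifiers. The lexicon `SUFFIX_TYPES`, the stop-word list `STOP_WORDS`, the function `extract_entities`, and the `max_modifiers` limit are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical suffix lexicon: head word -> entity type (not from the paper).
SUFFIX_TYPES = {"pump": "Equipment", "valve": "Component", "duct": "Component"}
# Hypothetical stop words that end the leftward expansion of an entity span.
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "is", "by", "with"}


def extract_entities(segment, max_modifiers=2):
    """Return (entity_text, entity_type) pairs found in one text segment."""
    tokens = re.findall(r"[a-z]+", segment.lower())
    entities = []
    for i, token in enumerate(tokens):
        entity_type = SUFFIX_TYPES.get(token)
        if entity_type is None:
            continue  # token is not a known suffix, so no entity ends here
        # Extend left over at most max_modifiers tokens, stopping at a stop word.
        start = i
        while (start > 0 and i - start < max_modifiers
               and tokens[start - 1] not in STOP_WORDS):
            start -= 1
        entities.append((" ".join(tokens[start:i + 1]), entity_type))
    return entities


sentence = "The chilled water pump is connected to the supply duct by a check valve."
for text, label in extract_entities(sentence):
    print(f"{label}: {text}")
```

A rule of this kind needs no labeled training data, which is the motivation the abstract gives for preferring rules over deep learning models in the MEP domain; the trade-off is that coverage depends entirely on the quality of the suffix lexicon.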
Pages: 14