A Methodology for Extracting Knowledge about Controlled Vocabularies from Textual Data using FCA-Based Ontology Engineering

Cited by: 0
Authors
Jabbari, Simin [1, 2]
Stoffel, Kilian [1 ]
Affiliations
[1] Univ Neuchatel, Informat Management Inst, CH-2000 Neuchatel, Switzerland
[2] F Hoffmann La Roche Ltd, Diagnost Data Sci Lab, CH-4070 Basel, Switzerland
Keywords
Semantic knowledge extraction; Ontology learning; Controlled vocabulary; Formal Concept Analysis; Natural Language Processing
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
We introduce an end-to-end methodology, from text processing to querying a knowledge graph, for extracting knowledge from text corpora with a focus on a given list of controlled vocabularies. We propose a pipeline that combines Natural Language Processing (NLP), Formal Concept Analysis (FCA), and Ontology Engineering techniques to build an ontology from textual data. We then extract knowledge about the controlled vocabularies by querying that knowledge graph, i.e., the engineered ontology. We demonstrate the significance of the proposed methodology by applying it to a text corpus of 800 news articles and reports about companies and products in the IT and pharmaceutical domains, where the focus is on a given list of 250 controlled vocabularies.
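The FCA step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy context, the term and attribute names, and the helper functions below are all hypothetical, and the brute-force enumeration is only practical for very small inputs. It shows how a formal context (terms as objects, co-occurring relations as attributes) yields formal concepts, and how a controlled-vocabulary term might then be looked up among them.

```python
from itertools import combinations

# Hypothetical toy formal context: objects are terms found in text,
# attributes are relations they occur with (all invented for illustration).
context = {
    "aspirin":   {"treats", "is_sold_by"},
    "ibuprofen": {"treats", "is_sold_by"},
    "laptop":    {"is_sold_by", "runs"},
    "server":    {"runs"},
}

def common_attributes(objects):
    """Attributes shared by every object in `objects` (the derivation A')."""
    if not objects:
        # By convention, the empty object set shares all attributes.
        return set().union(*context.values())
    return set.intersection(*(context[o] for o in objects))

def common_objects(attributes):
    """Objects that carry every attribute in `attributes` (the derivation B')."""
    return {o for o, attrs in context.items() if attributes <= attrs}

def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every object subset."""
    concepts = set()
    objs = sorted(context)
    for r in range(len(objs) + 1):
        for subset in combinations(objs, r):
            intent = common_attributes(set(subset))
            extent = common_objects(intent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts

def concepts_about(term):
    """Concepts whose extent contains a given controlled-vocabulary term."""
    return [(e, i) for e, i in formal_concepts() if term in e]
```

In a full pipeline, the resulting concept lattice would presumably be translated into ontology classes and queried as a knowledge graph; here `concepts_about("aspirin")` simply stands in for that querying step.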
Pages: 1657-1661
Page count: 5
Related papers
7 items in total
  • [1] FCA-Based Ontology Learning from Unstructured Textual Data
    Jabbari, Simin
    Stoffel, Kilian
    [J]. MINING INTELLIGENCE AND KNOWLEDGE EXPLORATION, MIKE 2018, 2018, 11308 : 1 - 10
  • [2] Ontology-Based Data Access for Extracting Event Logs from Legacy Data: The onprom Tool and Methodology
    Calvanese, Diego
    Kalayci, Tahir Emre
    Montali, Marco
    Tinella, Stefano
    [J]. BUSINESS INFORMATION SYSTEMS (BIS 2017), 2017, 288 : 220 - 236
  • [3] Extracting knowledge from text using SHELDON, a Semantic Holistic framEwork for LinkeD ONtology data
    Recupero, Diego Reforgiato
    Nuzzolese, Andrea G.
    Consoli, Sergio
    Presutti, Valentina
    Peroni, Silvio
    Mongiovì, Misael
    [J]. WWW'15 COMPANION: PROCEEDINGS OF THE 24TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB, 2015, : 235 - 238
  • [4] Using XML and controlled vocabularies to achieve unambiguous knowledge acquisition from multiple heterogeneous medical data sources
    Kontaxis, KM
    Sakellaris, GC
    Fotiadis, DI
    [J]. ITAB 2003: 4TH INTERNATIONAL IEEE EMBS SPECIAL TOPIC CONFERENCE ON INFORMATION TECHNOLOGY APPLICATIONS IN BIOMEDICINE, CONFERENCE PROCEEDINGS: NEW SOLUTIONS FOR NEW CHALLENGES, 2003, : 161 - 164
  • [5] Extracting Entities of Emergent Events from Social Streams Based on a Data-Cluster Slicing Approach for Ontology Engineering
    Lee, Chung-Hong
    Wu, Chih-Hung
    [J]. INTERNATIONAL JOURNAL OF INFORMATION RETRIEVAL RESEARCH, 2015, 5 (03) : 1 - 18
  • [6] Extracting personalised ontology from data-intensive web application: an HTML forms-based reverse engineering approach
    Benslimane, Sidi Mohamed
    Malki, Mimoun
    Rahmouni, Mustapha Kamal
    Benslimane, Djamal
    [J]. INFORMATICA, 2007, 18 (04) : 511 - 534
  • [7] Extracting Facilitators of New Businesses Within an Existing Company Using Textual Data Generated from Structured Descriptive Questions Based on a Two-dimensional Framework
    Taki, Shoi
    Dan, Ippeita
    Minami, Yuko
    Handa, Toru
    Kyutoku, Yasushi
    [J]. INTERNATIONAL JOURNAL OF AFFECTIVE ENGINEERING, 2024, 23 (02) : 143 - 155