Ontology-based information extraction and integration from heterogeneous data sources

Cited by: 58
Authors
Buitelaar, Paul [2 ]
Cimiano, Philipp [1 ]
Frank, Anette [3 ]
Hartung, Matthias [3 ]
Racioppa, Stefania [2 ]
Affiliations
[1] Univ Karlsruhe TH, Inst AIFB, D-76131 Karlsruhe, Germany
[2] DFKI GmbH, Language Technol Lab, D-66123 Saarbrucken, Germany
[3] Heidelberg Univ, Seminar Comp Linguist, D-69120 Heidelberg, Germany
Keywords
Ontology-based natural language processing; Information extraction; Knowledge integration; Question answering
DOI
10.1016/j.ijhcs.2008.07.007
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper we present the design, implementation and evaluation of SOBA, a system for ontology-based information extraction from heterogeneous data resources, including plain text, tables and image captions. SOBA is capable of processing structured information, text and image captions to extract information and integrate it into a coherent knowledge base. To establish coherence, SOBA interlinks the information extracted from different sources and detects duplicate information. The knowledge base produced by SOBA can then be used to query for information contained in the different sources in an integrated and seamless manner. Overall, this allows for advanced retrieval functionality by which questions can be answered precisely. A further distinguishing feature of the SOBA system is that it straightforwardly integrates deep and shallow natural language processing to increase robustness and accuracy. We discuss the implementation and application of the SOBA system within the SmartWeb multimodal dialog system. In addition, we present a thorough evaluation of the different components of the system. However, an end-to-end evaluation of the whole SmartWeb system is out of the scope of this paper and has been presented elsewhere by the SmartWeb consortium. © 2008 Elsevier Ltd. All rights reserved.
Pages: 759-788
Number of pages: 30