Ontology-based information extraction and integration from heterogeneous data sources

Cited by: 58
Authors
Buitelaar, Paul [2]
Cimiano, Philipp [1]
Frank, Anette [3]
Hartung, Matthias [3]
Racioppa, Stefania [2]
Affiliations
[1] Univ Karlsruhe TH, Inst AIFB, D-76131 Karlsruhe, Germany
[2] DFKI GmbH, Language Technol Lab, D-66123 Saarbrucken, Germany
[3] Heidelberg Univ, Seminar Comp Linguist, D-69120 Heidelberg, Germany
Keywords
Ontology-based natural language processing; Information extraction; Knowledge integration; Question answering
DOI
10.1016/j.ijhcs.2008.07.007
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper we present the design, implementation and evaluation of SOBA, a system for ontology-based information extraction from heterogeneous data resources, including plain text, tables and image captions. SOBA is capable of processing structured information, text and image captions to extract information and integrate it into a coherent knowledge base. To establish coherence, SOBA interlinks the information extracted from different sources and detects duplicate information. The knowledge base produced by SOBA can then be used to query for information contained in the different sources in an integrated and seamless manner. Overall, this allows for advanced retrieval functionality by which questions can be answered precisely. A further distinguishing feature of the SOBA system is that it straightforwardly integrates deep and shallow natural language processing to increase robustness and accuracy. We discuss the implementation and application of the SOBA system within the SmartWeb multimodal dialog system. In addition, we present a thorough evaluation of the different components of the system. However, an end-to-end evaluation of the whole SmartWeb system is out of the scope of this paper and has been presented elsewhere by the SmartWeb consortium. (C) 2008 Elsevier Ltd. All rights reserved.
Pages: 759-788
Page count: 30
Related papers
50 in total
  • [1] Ontology-based integration of data sources
    Gagnon, Michel
    [J]. 2007 PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON INFORMATION FUSION, VOLS 1-4, 2007, : 896 - 903
  • [2] An Ontology-Based Data Integration system for data and multimedia sources
    Beneventano, Domenico
    Orsini, Mirko
    Po, Laura
    Sala, Antonio
    Sorrentino, Serena
    [J]. 2009 IEEE THIRD INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2009), 2009, : 606 - 611
  • [3] Ontology-based Integration of Heterogeneous and Distributed Information of the Marine Domain
    Tzitzikas, Yannis
    Allocca, Carlo
    Bekiari, Chryssoula
    Marketakis, Yannis
    Fafalios, Pavlos
    Minadakis, Nikos
    [J]. ERCIM NEWS, 2014, (96): : 42 - 43
  • [4] Query division and reformulation in ontology-based heterogeneous information integration
    Li Jian
    Jin Beihong
    [J]. CIC 2006: 15TH INTERNATIONAL CONFERENCE ON COMPUTING, PROCEEDINGS, 2006, : 186 - +
  • [5] Source Information Disclosure in Ontology-Based Data Integration
    Benedikt, Michael
    Grau, Bernardo Cuenca
    Kostylev, Egor V.
    [J]. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 1056 - 1062
  • [6] Study on Ontology-based Semantic Extraction Method for Heterogeneous Data
    Li, Gaihai
    Zhang, Yan
    Yang, Wangli
    Shi, Guiying
    [J]. PROCEEDINGS OF THE FIRST INTERNATIONAL CONFERENCE ON INFORMATION SCIENCES, MACHINERY, MATERIALS AND ENERGY (ICISMME 2015), 2015, 126 : 930 - 934
  • [7] Ontology-based Data Sources' Integration for Maritime Event Recognition
    Santipantakis, Georgios
    Kotis, Konstantinos I.
    Vouros, George A.
    [J]. 2015 6TH INTERNATIONAL CONFERENCE ON INFORMATION, INTELLIGENCE, SYSTEMS AND APPLICATIONS (IISA), 2015,
  • [8] Ontology-based metadata dictionary for integrating Heterogeneous Information Sources on the WWW
    Arch-int, N
    Sophatsathit, P
    Li, YF
    [J]. JOURNAL OF RESEARCH AND PRACTICE IN INFORMATION TECHNOLOGY, 2003, 35 (04): : 285 - 302
  • [9] Query processing the heterogeneous information sources using ontology-based approach
    Arch-int, N
    Li, YY
    Roe, P
    Sophatsathit, P
    [J]. COMPUTERS AND THEIR APPLICATIONS, 2003, : 438 - 441
  • [10] LinkSuite™: Formally robust ontology-based data and information integration
    Ceusters, W
    Smith, B
    Fielding, JM
    [J]. DATA INTEGRATION IN THE LIFE SCIENCES, PROCEEDINGS, 2004, 2994 : 124 - 139