Extracting Room Prices from Web Tables - an Ontology-Aware Approach

Authors
Buttinger, Christina [1]
Feilmayr, Christina [1]
Guttenbrunner, Michael [1]
Parzer, Stefan [1]
Proell, Birgit [1]
Affiliations
[1] Johannes Kepler Univ Linz, Inst Applicat Oriented Knowledge Proc, A-4040 Linz, Austria
Keywords
Ontology-based Information Extraction; Table Information Extraction; Price Table Pattern; Tourism Price Ontology; Ontology-aware Price Annotation
DOI
10.1007/978-3-211-99407-8_19
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
The growing amount of semi-structured and unstructured data on tourism Web sites with heterogeneous designs requires information extraction (IE) mechanisms, for instance to create tourism portals. In order to build semantic eTourism environments, the acquisition of room prices is of particular interest. Room prices and related information often appear in tabular structures, which still challenge Web information extraction techniques. In this paper, we begin by identifying various price table patterns, which are characterized by the position of a number of features that determine a room price. We then describe an extended ontology model for tourism prices. Finally, we present TAINEX, a plug-in for functional and structural analysis and data interpretation of price tables, which extends the existing prototype TourIE, a rule- and ontology-based information extraction system for Web sites with heterogeneous designs.
Pages: 223-234
Number of pages: 12