Extraction of relational schema from deep web sources: a form driven approach

Cited by: 0
Authors
Saissi, Yasser [1]
Zellou, Ahmed [1]
Idri, Ali [1]
Affiliations
[1] Mohammed V Univ, ENSIAS, Rabat, Morocco
Keywords
Deep web source; Web source integration; Structured data; HTML form; Databases
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
The deep web is the largest unexplored part of the web, and we need direct access to all of its data sources without using any crawling or surfacing method. To this end, we choose to use a virtual web integration system. However, existing deep web virtual integration methods focus only on integrating the query interfaces that give access to the deep web. These query interfaces are integrated to build a global query interface able to query all the deep web sources. The objective of our work is to propose another vision of a deep web virtual integration system, one that uses a mediated schema built from relational schemas, each describing a deep web source. This paper presents our approach to extracting a relational schema describing a deep web source. The key idea underlying our approach is to analyze two pieces of structured information extracted from the deep web source, the HTML form and the HTML table, to discover the source's data structure and build a relational schema describing it. We also use a knowledge table to take advantage of our learning experience in extracting relational schemas from deep web sources.
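The abstract's key idea, reading a source's HTML form and HTML table to propose a relational schema, can be illustrated with a minimal sketch. This is not the authors' algorithm: the class `FormTableParser`, the function `candidate_relation`, the sample page, and the rule of simply merging form field names with table header labels are all assumptions made for illustration, and the knowledge table mentioned in the abstract is not modeled.

```python
from html.parser import HTMLParser


class FormTableParser(HTMLParser):
    """Hypothetical helper: collects form field names and table header labels."""

    def __init__(self):
        super().__init__()
        self.form_fields = []    # names of <input>/<select>/<textarea> controls
        self.table_headers = []  # text content of <th> cells
        self._in_th = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("input", "select", "textarea"):
            name = attrs.get("name")
            # Skip unnamed controls and buttons: they carry no data attribute.
            if name and attrs.get("type") not in ("submit", "reset", "button"):
                self.form_fields.append(name)
        elif tag == "th":
            self._in_th = True
            self.table_headers.append("")

    def handle_endtag(self, tag):
        if tag == "th":
            self._in_th = False

    def handle_data(self, data):
        if self._in_th:
            self.table_headers[-1] += data.strip()


def candidate_relation(html, relation_name="source"):
    """Merge form fields and table headers into one candidate relation (assumed rule)."""
    parser = FormTableParser()
    parser.feed(html)
    attributes = []
    for label in parser.form_fields + parser.table_headers:
        attr = label.strip().lower().replace(" ", "_")
        if attr and attr not in attributes:  # drop duplicates, keep first-seen order
            attributes.append(attr)
    return relation_name, attributes


# Invented sample page: a search form plus the result table it returns.
SAMPLE = """
<form action="/search">
  <input type="text" name="title">
  <select name="author"><option>Any</option></select>
  <input type="submit" value="Go">
</form>
<table>
  <tr><th>Title</th><th>Author</th><th>Price</th></tr>
  <tr><td>Dune</td><td>Herbert</td><td>9.99</td></tr>
</table>
"""

name, attrs = candidate_relation(SAMPLE, "book")
print(f"{name}({', '.join(attrs)})")
```

In this sketch the form contributes the attributes a user can query on, while the result table contributes attributes that only appear in the returned data (here, `price`); a real system would also need to infer keys, types, and relations between tables.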
Pages: 178-182
Number of pages: 5