Discovering the Deep Web through XML Schema Extraction

Cited by: 0
Authors
Saissi, Yasser [1]
Zellou, Ahmed [1]
Idri, Ali [1]
Affiliations
[1] Mohammed V Univ Rabat, Rabat, Morocco
Keywords
Deep Web; Schema Extraction; Web Integration; Databases
DOI
10.5220/0006013901410149
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The part of the web that search engines can reach contains a vast amount of information. However, another part of the web, called the deep web, is accessible only through its associated HTML forms and contains much more information. Integrating deep web content presents many challenges that current deep web access approaches do not fully address. Integrating deep web data requires knowing the schema that describes each deep web source. This paper presents our approach to extracting the XML schema describing a selected deep web source. The extracted XML schema will be used to integrate the associated deep web source into a mediation system. The principle of our approach is to apply a static and a dynamic analysis to the HTML forms that give access to the selected deep web source. We describe the algorithms of our approach and compare it with other existing approaches.
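To make the static half of this idea concrete, the sketch below reads the named fields of an HTML form and emits a small XML schema for them. It is an illustration only and not the authors' algorithm: the names `FormFieldCollector`, `form_to_xsd`, and the root element name `record` are assumptions introduced here, free-text fields are typed as `xs:string`, and `<select>` options become an enumeration. The dynamic analysis mentioned in the abstract (submitting queries and inspecting results) is not covered.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
SKIPPED_INPUT_TYPES = ("submit", "button", "reset", "hidden")


class FormFieldCollector(HTMLParser):
    """Hypothetical helper: collects named form fields (static analysis only)."""

    def __init__(self):
        super().__init__()
        self.fields = {}     # field name -> option values (empty list = free text)
        self._select = None  # name of the <select> currently open, if any

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        name = a.get("name")
        if tag in ("input", "textarea") and name:
            if (a.get("type") or "text").lower() not in SKIPPED_INPUT_TYPES:
                self.fields.setdefault(name, [])
        elif tag == "select" and name:
            self._select = name
            self.fields.setdefault(name, [])
        elif tag == "option" and self._select and a.get("value"):
            self.fields[self._select].append(a["value"])

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


def form_to_xsd(html, root_name="record"):
    """Return an XML schema (as text) with one element per form field."""
    collector = FormFieldCollector()
    collector.feed(html)

    ET.register_namespace("xs", XS)
    schema = ET.Element(f"{{{XS}}}schema")
    root = ET.SubElement(schema, f"{{{XS}}}element", name=root_name)
    ctype = ET.SubElement(root, f"{{{XS}}}complexType")
    seq = ET.SubElement(ctype, f"{{{XS}}}sequence")

    for name, options in collector.fields.items():
        if not options:
            # Free-text field: assume a plain string.
            ET.SubElement(seq, f"{{{XS}}}element", name=name, type="xs:string")
            continue
        # Field with a fixed option list: restrict to the listed values.
        el = ET.SubElement(seq, f"{{{XS}}}element", name=name)
        stype = ET.SubElement(el, f"{{{XS}}}simpleType")
        restr = ET.SubElement(stype, f"{{{XS}}}restriction", base="xs:string")
        for value in options:
            ET.SubElement(restr, f"{{{XS}}}enumeration", value=value)

    return ET.tostring(schema, encoding="unicode")


if __name__ == "__main__":
    page = (
        '<form><input type="text" name="title">'
        '<select name="genre"><option value="a">A</option>'
        '<option value="b">B</option></select>'
        '<input type="submit" value="Go"></form>'
    )
    print(form_to_xsd(page))
```

A form only exposes the query interface of a source, so a schema derived this way reflects the searchable attributes rather than the full structure of the underlying database.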
Pages: 141-149
Page count: 9
Related papers
50 in total
  • [1] Towards XML Schema Extraction from Deep Web
    Saissi, Yasser
    Zellou, Ahmed
    Idri, Ali
    [J]. 2016 4TH IEEE INTERNATIONAL COLLOQUIUM ON INFORMATION SCIENCE AND TECHNOLOGY (CIST), 2016, : 94 - 99
  • [2] An Effective Schema Extraction Algorithm on the Deep Web
    Qiang, Bao-hua
    Xi, Jian-qing
    Zhang, Long
    [J]. 2008 4TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-31, 2008, : 10976 - +
  • [3] Schema Extraction of Deep Web Query Interface
    Wang, Ying
    Peng, Tao
    Zuo, Wanli
    Zhu, Huifeng
    [J]. WISM: 2009 INTERNATIONAL CONFERENCE ON WEB INFORMATION SYSTEMS AND MINING, PROCEEDINGS, 2009, : 391 - 395
  • [4] Effective Schema Extraction of Query Interfaces on the Deep Web
    Qiang, Bao-hua
    Xi, Jian-qing
    Chen, Ling
    [J]. FIFTH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY, VOL 2, PROCEEDINGS, 2008, : 291 - +
  • [5] Research on Deep Web Query Interface Classification Based on XML Schema
    苟和平
    景永霞
    吴多智
    [J]. Journal of Changchun University (长春大学学报), 2016, 26 (04) : 13 - 18
  • [6] Schema extraction and levelization for XML data
    Yoon, JP
    Kim, SR
    [J]. DATA MINING AND KNOWLEDGE DISCOVERY: THEORY, TOOLS AND TECHNOLOGY III, 2001, 4384 : 116 - 125
  • [7] DWSpyder: A new schema extraction method for a deep web integration system
    Saissi, Yasser
    Zellou, Ahmed
    Idri, Ali
    [J]. International Journal of Web Engineering and Technology, 2019, 14 (02): : 122 - 150
  • [8] Schema Extraction for Deep Web Query Interfaces Using Heuristics Rules
    Chichang Jou
    [J]. Information Systems Frontiers, 2019, 21 : 163 - 174
  • [9] An effective method supporting data extraction and schema recognition on deep web
    Liu, Wei
    Shen, Derong
    Nie, Tiezheng
    [J]. PROGRESS IN WWW RESEARCH AND DEVELOPMENT, PROCEEDINGS, 2008, 4976 : 419 - 431