An Effective Schema Extraction Algorithm on the Deep Web

Authors
Qiang, Bao-hua [1 ,2 ]
Xi, Jian-qing [1 ]
Zhang, Long [2 ]
Affiliations
[1] South China Univ Technol, Sch Engn & Comp Sci, Guangzhou 510641, Peoples R China
[2] Southwest Univ, Coll Comp & Informat Sci, Chongqing 400715, Peoples R China
Keywords
Deep Web; schema extraction algorithm; query interface; grouping patterns
DOI
Not available
Chinese Library Classification number
TN [Electronic technology, communication technology]
Discipline classification code
0809
Abstract
The Deep Web, a complex entity that contains information from a variety of source types, has attracted considerable attention in recent years. To unlock the vast content of the Deep Web, effective approaches for extracting, indexing, and searching the query interfaces of dynamic web pages need careful study. Building on our previously proposed grouping patterns and pre-clustering algorithm, this paper presents an effective schema extraction algorithm. Three metrics, (LCA) precision, (LCA) recall, and (LCA) F1, are used to evaluate the performance of the schema extraction algorithm. The experimental results indicate that our algorithm noticeably improves the performance of schema extraction for query interfaces on the Deep Web and avoids inconsistencies between the subsets produced by the pre-clustering algorithm and those produced by the schema extraction algorithm.
Pages: 10976 / +
Page count: 2