On the automatic extraction of data from the hidden web

Cited by: 0
Authors
Liddle, SW [1]
Yau, SH
Embley, DW
Affiliations
[1] Brigham Young Univ, Informat Syst Grp, Provo, UT 84602 USA
[2] Brigham Young Univ, Dept Comp Sci, Provo, UT 84602 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
An increasing amount of Web data is accessible only by filling out HTML forms to query an underlying data source. While this is most welcome from a user perspective (queries are easy and precise) and from a data management perspective (static pages need not be maintained; databases can be accessed directly), automated agents have greater difficulty accessing data behind forms. In this paper we present a method for automatically filling in forms to retrieve the associated dynamically generated pages. Using our approach, automated agents can begin to systematically access portions of the "hidden Web."
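The abstract does not spell out how the forms are filled in, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes the simplest possible strategy: read the first form on a page, keep every field at its default value, and build the query that a browser would submit. The names `FormFieldCollector`, `build_default_query`, and the sample form and URL are hypothetical, introduced here for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FormFieldCollector(HTMLParser):
    """Collect the action, method, and default field values of the first form."""

    def __init__(self):
        super().__init__()
        self.action = ""
        self.method = "get"
        self.fields = {}      # field name -> default value
        self._in_form = False
        self._done = False    # set once the first form has been closed
        self._select = None   # name of the <select> being parsed, if any

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = a.get("action") or ""
            self.method = (a.get("method") or "get").lower()
        elif not self._in_form:
            return
        elif tag == "input" and a.get("name"):
            kind = (a.get("type") or "text").lower()
            if kind in ("submit", "button", "reset", "image", "file"):
                return
            if kind in ("checkbox", "radio"):
                # Unchecked boxes are not submitted; checked ones default to "on".
                if "checked" in a:
                    self.fields[a["name"]] = a.get("value") or "on"
                return
            self.fields[a["name"]] = a.get("value") or ""
        elif tag == "select" and a.get("name"):
            self._select = a["name"]
        elif tag == "option" and self._select is not None:
            # Keep the first option unless a later one is marked selected.
            # Simplification: options without a value attribute yield "".
            if self._select not in self.fields or "selected" in a:
                self.fields[self._select] = a.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None
        elif tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def build_default_query(page_url, html):
    """Return (method, target URL, encoded query) for the first form in html."""
    parser = FormFieldCollector()
    parser.feed(html)
    target = urljoin(page_url, parser.action)
    return parser.method, target, urlencode(parser.fields)


# Hypothetical sample page; no network access is performed.
SAMPLE = """
<form action="/search" method="get">
  <input type="text" name="make" value="">
  <select name="year"><option value="1999">1999</option>
  <option value="2000" selected>2000</option></select>
  <input type="checkbox" name="used" value="1" checked>
  <input type="submit" value="Go">
</form>
"""
method, target, query = build_default_query(
    "http://example.com/cars/index.html", SAMPLE
)
```

A real agent would go further than this default query, for example by trying other combinations of field values and by deciding when enough of the underlying data has been retrieved; none of that is shown here.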
Pages: 212-226
Number of pages: 15