KPS: a Web information mining algorithm

Cited by: 5
Authors
Guan, T [1]
Wong, KF
Affiliations
[1] Univ Regina, Dept Comp Sci, Regina, SK S4S 0A2, Canada
[2] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
information extraction; information retrieval; Web query; Web databases
DOI
10.1016/S1389-1286(99)00048-1
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
The Web mostly contains semi-structured information. It is, however, not easy to search and extract structural data hidden in a Web page. Current practices address this problem by (1) syntax analysis (i.e. HTML tags); or (2) wrappers or user-defined declarative languages. The former is only suitable for highly structured Web sites and the latter is time-consuming and offers low scalability. Wrappers could handle tens, but certainly not thousands, of information sources. In this paper, we present a novel information mining algorithm, namely KPS, over semi-structured information on the Web. KPS employs keywords, patterns and/or samples to mine the desired information. Experimental results show that KPS is more efficient than existing Web extracting methods. © 1999 Published by Elsevier Science B.V. All rights reserved.
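The abstract does not describe how KPS works internally, so the following is only an illustrative sketch of the general idea of keyword- and pattern-driven extraction from semi-structured HTML. It is not the KPS algorithm. The helper name `extract_by_keyword_and_pattern`, the sample page, and the rule "look in the text right after the keyword, then in the next text chunk" are all assumptions made for this example.

```python
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the non-empty text chunks of an HTML page, ignoring tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_by_keyword_and_pattern(html, keyword, pattern):
    """Return values matching `pattern` that appear close to `keyword`.

    Hypothetical helper (not from the paper): for every text chunk that
    contains the keyword, search the remainder of that chunk and then
    the following chunk for the first regex match.
    """
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    chunks = collector.chunks
    value_re = re.compile(pattern)
    results = []
    for i, chunk in enumerate(chunks):
        pos = chunk.lower().find(keyword.lower())
        if pos < 0:
            continue
        # Candidate locations: text after the keyword, then the next chunk.
        candidates = [chunk[pos + len(keyword):]] + chunks[i + 1:i + 2]
        for candidate in candidates:
            match = value_re.search(candidate)
            if match:
                results.append(match.group(0))
                break
    return results


# Made-up sample page mixing a table layout with free text.
page = """
<html><body>
<table>
<tr><td><b>Price</b></td><td>$19.99</td></tr>
<tr><td>Phone:</td><td>306-555-0100</td></tr>
</table>
<p>Sale price: $4.50 today only</p>
</body></html>
"""

print(extract_by_keyword_and_pattern(page, "price", r"\$\d+(?:\.\d{2})?"))
print(extract_by_keyword_and_pattern(page, "phone", r"\d{3}-\d{3}-\d{4}"))
```

The sketch relies on a keyword plus a value pattern rather than on a fixed tag path, so the same call handles both the table row and the free-text paragraph. A hand-written wrapper tied to one page layout would typically need a separate rule for each.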
Pages: 1495-1507
Number of pages: 13