KPS: a Web information mining algorithm

Cited by: 5
Authors
Guan, T [1]
Wong, KF
Affiliations
[1] Univ Regina, Dept Comp Sci, Regina, SK S4S 0A2, Canada
[2] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Peoples R China
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
information extraction; information retrieval; Web query; Web databases;
DOI
10.1016/S1389-1286(99)00048-1
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The Web mostly contains semi-structured information. It is, however, not easy to search and extract structural data hidden in a Web page. Current practices address this problem by (1) syntax analysis (i.e. HTML tags); or (2) wrappers or user-defined declarative languages. The former is only suitable for highly structured Web sites and the latter is time-consuming and offers low scalability. Wrappers could handle tens, but certainly not thousands, of information sources. In this paper, we present a novel information mining algorithm, namely KPS, over semi-structured information on the Web. KPS employs keywords, patterns and/or samples to mine the desired information. Experimental results show that KPS is more efficient than existing Web extracting methods. (C) 1999 Published by Elsevier Science B.V. All rights reserved.
Pages: 1495-1507
Page count: 13
    Qin Lei
    [J]. ADVANCED COMPUTER TECHNOLOGY, NEW EDUCATION, PROCEEDINGS, 2007, : 495 - 498