Mining unstructured web pages to enhance web information retrieval

Cited by: 0
Authors
Yang, Hsin-Chang [1]
Lee, Chung-Hong [2]
Affiliations
[1] Chang Jung Univ, Dept Informat Management, Tainan, Taiwan
[2] Natl Kaohsiung Univ Appl Sci, Dept Elect Engn, Kaohsiung, Taiwan
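Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract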
One major approach to finding information on the WWW is to navigate web directories and browse them for the target pages. However, such directories are generally constructed manually and suffer from narrow coverage and inconsistency. In this work, we propose a machine learning approach that automatically constructs a navigational structure for the WWW to help users find information. A self-organizing map is trained on the web pages to obtain two feature maps, which reveal the relationships among web pages and among thematic keywords, respectively. We then use these maps to develop a structure that may help users find the information they need. We used a small set of web pages in the experiments and obtained promising results.
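The abstract does not give implementation details, so the following is only a minimal sketch of the general idea: train a small self-organizing map on bag-of-words page vectors, then read off one map for pages and one for keywords. The toy vocabulary, page vectors, grid size, decay schedules, and the helper names `train_som`, `best_matching_unit`, `page_map`, and `keyword_map` are all illustrative assumptions, not the authors' method or data.

```python
import math
import random


def best_matching_unit(weights, v):
    """Return the grid position whose weight vector is closest to v."""
    return min(weights, key=lambda pos: math.dist(weights[pos], v))


def train_som(vectors, rows, cols, epochs=50, seed=0):
    """Train a small rectangular SOM (assumed linear decay schedules)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One weight vector per neuron, keyed by its (row, col) grid position.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)                             # decaying learning rate
        radius = max(rows, cols) / 2 * (1.0 - frac) + 0.5   # shrinking neighborhood
        for v in vectors:
            bmu = best_matching_unit(weights, v)
            for pos, w in weights.items():
                d = math.dist(pos, bmu)                     # distance on the grid
                if d <= radius:
                    influence = math.exp(-(d * d) / (2 * radius * radius))
                    for i in range(dim):
                        w[i] += lr * influence * (v[i] - w[i])
    return weights


def page_map(weights, pages):
    """Label each page with the neuron that best matches its vector."""
    return {name: best_matching_unit(weights, v) for name, v in pages.items()}


def keyword_map(weights, vocabulary):
    """Label each keyword with the neuron that weights it most strongly."""
    return {word: max(weights, key=lambda pos: weights[pos][i])
            for i, word in enumerate(vocabulary)}


# Hypothetical toy data: four pages over a four-word vocabulary.
vocabulary = ["football", "league", "python", "compiler"]
pages = {
    "sports-1": [1, 1, 0, 0],
    "sports-2": [1, 0, 0, 0],
    "code-1": [0, 0, 1, 1],
    "code-2": [0, 0, 1, 0],
}

weights = train_som(list(pages.values()), rows=2, cols=2)
pages_on_map = page_map(weights, pages)
keywords_on_map = keyword_map(weights, vocabulary)
print(pages_on_map)
print(keywords_on_map)
```

Pages that land on the same or neighboring neurons can be treated as related, and keywords assigned to a neuron can serve as candidate labels for the pages there. This is one plausible way to turn the two maps into a browsable structure.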
Pages: 429 - +
Page count: 2
Related papers
50 in total
  • [1] Geographic Information Retrieval and Text Mining on Chinese Tourism Web Pages
    Tsou, Ming-Cheng
    [J]. INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY AND WEB ENGINEERING, 2010, 5 (01) : 56 - 75
  • [2] Mining key information of web pages
    Wang, C
    Lu, J
    Zhang, GQ
    [J]. PROCEEDINGS OF THE 8TH JOINT CONFERENCE ON INFORMATION SCIENCES, VOLS 1-3, 2005, : 1573 - 1576
  • [3] Web pages clustering and concepts mining: An approach towards intelligent information retrieval
    Li, Fang
    Mehlitz, Martin
    Fen, Li
    Sheng, Huanye
    [J]. 2006 IEEE CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEMS, VOLS 1 AND 2, 2006, : 522 - +
  • [4] Fast Information Retrieval from Web Pages
    El-Bakry, Hazem M.
    Mastorakis, Nikos
    [J]. PROCEEDINGS OF THE 7TH WSEAS INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE, MAN-MACHINE SYSTEMS AND CYBERNETICS (CIMMACS '08): RECENT ADVANCES IN COMPUTATIONAL INTELLIGENCE, MAN-MACHINE SYSTEMS AND CYBERNETICS, 2008, : 229 - +
  • [5] Information Retrieval Based on Image Detection on Web Pages
    El-Bakry, Hazem M.
    Mastorakis, Nikos
    [J]. PROCEEDINGS OF THE 7TH WSEAS INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE, MAN-MACHINE SYSTEMS AND CYBERNETICS (CIMMACS '08): RECENT ADVANCES IN COMPUTATIONAL INTELLIGENCE, MAN-MACHINE SYSTEMS AND CYBERNETICS, 2008, : 221 - +
  • [6] Mining environmental texts of images in web pages for image retrieval
    Yang, HC
    Lee, CH
    [J]. FOUNDATIONS OF INTELLIGENT SYSTEMS, 2003, 2871 : 334 - 338
  • [7] Web mining for web image retrieval
    Chen, Zheng
    Wenyin, Liu
    Zhang, Feng
    Li, Mingjing
    Zhang, Hongjiang
    [J]. 2001, John Wiley and Sons Inc. (52):
  • [8] Web mining for Web image retrieval
    Zheng, C
    Liu, WY
    Feng, Z
    Li, MJ
    Zhang, HJ
    [J]. JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2001, 52 (10) : 831 - 839
  • [9] Term frequency occurrences on web pages for textual information retrieval
    Sivapathasundaram, Karthika
    Cheng, Xiaochun
    Petridis, Miltos
    [J]. DATA SCIENCE AND KNOWLEDGE ENGINEERING FOR SENSING DECISION SUPPORT, 2018, 11 : 585 - 590
  • [10] Using web mining to sample the contents of an information ecosystem to locate and retrieve web pages
    Walker, RL
    [J]. Proceedings of the ISCA 20th International Conference on Computers and Their Applications, 2005, : 247 - 252