Extracting Structure of Web Site Based on Hyperlink Analysis

Cited by: 0
Author
Li, Feng [1]
Affiliation
[1] S China Univ Technol, Sch Business Adm, Guangzhou, Guangdong, Peoples R China
Keywords
Web Site Structure; Web Mining; Hyperlink Analysis
DOI
Not available
Chinese Library Classification number
TN [Electronic technology, communication technology]
Discipline classification code
0809
Abstract
The structure of a Web site usually reflects the implicit logical relationships among its Web pages and is widely applied in Web mining and Web information retrieval. However, it is difficult for a machine to extract the structure of a Web site automatically because of the many kinds of noise hyperlinks. This paper proposes an algorithm that extracts the structure of a Web site automatically based on hyperlink analysis. The algorithm identifies and filters noise hyperlinks by the patterns of the Web pages that these hyperlinks connect, rather than by the patterns of the hyperlinks themselves. It promises better performance than previous approaches. Preliminary results show that the proposed algorithm substantially improves both precision and recall.
Pages: 10919-10922
Number of pages: 4