Determining the titles of Web pages using anchor text and link analysis

Cited by: 6
Authors
Jeong, Ok-Ran [1]
Oh, Jehwan [2]
Kim, Dong-Jin
Lyu, Heetae [3]
Kim, Won [1]
Affiliations
[1] Gachon Univ, Dept Software Design & Management, Songnam, South Korea
[2] Univ Minnesota, Dept Comp Sci, Minneapolis, MN 55455 USA
[3] Naver Corp, Songnam, South Korea
Funding
National Research Foundation of Singapore
Keywords
Anchor text; Link analysis; Title extraction; Web page
DOI
10.1016/j.eswa.2013.12.033
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Determining the titles of Web pages is an important step in characterizing and categorizing the vast number of pages on the Web. A few approaches exist for determining Web page titles automatically. As an R&D project for Naver Corp, the operator of Naver (Korea's largest portal site), we developed a new method that uses anchor text and analysis of the links among Web pages. In this paper, we describe our method and report experimental results on its performance. © 2014 Elsevier Ltd. All rights reserved.
Pages: 4322-4329
Number of pages: 8