A Probabilistic Approach for Distillation and Ranking of Web Pages

Cited by: 2
Authors
Greco G. [1]
Greco S. [1]
Zumpano E. [1]
Affiliations
[1] DEIS, Università della Calabria, Rende
Keywords
information retrieval on the Web; random walks; search engines; Web searching
DOI
10.1023/A:1013883717655
Abstract
A great number of recent papers have investigated the possibility of introducing more effective and efficient algorithms for search engines. Traditional search engines rank results using textual information only and, as several works have shown, are not very useful for extracting relevant information. Recent research instead takes a new approach, called topic distillation, whose main task is to find relevant documents using a different similarity criterion: the retrieved documents are those related to the query topic, even if they do not contain the query string. Current algorithms for topic distillation first compute a base set containing all the relevant pages and then apply an iterative procedure to obtain the authoritative pages. In this paper, we present a different approach, which computes the authoritative pages by analyzing the structure of the base set. The technique applies a statistical approach to the co-citation matrix of the base set to find the most co-cited pages, and combines link analysis with an evaluation of page content. Several experiments have shown the validity of our approach. © 2001, Kluwer Academic Publishers.
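To make the notion of a co-citation matrix concrete, the following is a minimal sketch, not the authors' method: the abstract does not specify their statistical procedure, so the sketch only counts, for each pair of pages in a base set, how many pages link to both, and then orders pages by their total co-citation count. The function names, the toy link graph, and the scoring rule are all assumptions made for illustration.

```python
from itertools import combinations


def cocitation_matrix(links, pages):
    """Return C where C[a][b] counts the pages that link to both a and b.

    `links` maps each page to the pages it cites; `pages` is the base set.
    Both are hypothetical inputs for this illustration.
    """
    index = {p: i for i, p in enumerate(pages)}
    n = len(pages)
    C = [[0] * n for _ in range(n)]
    for source, targets in links.items():
        # Keep only links into the base set; drop duplicates and self-links.
        cited = sorted({index[t] for t in targets if t in index and t != source})
        # Every pair of pages cited by the same source is co-cited once.
        for a, b in combinations(cited, 2):
            C[a][b] += 1
            C[b][a] += 1
    return C


def most_cocited(links, pages):
    """Order pages by total co-citation count (an assumed, simplistic score)."""
    C = cocitation_matrix(links, pages)
    scores = {p: sum(C[i]) for i, p in enumerate(pages)}
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(pages, key=lambda p: (-scores[p], p))


# Hypothetical toy base set: pages "a" and "b" both cite "c" and "d".
pages = ["a", "b", "c", "d"]
links = {"a": ["c", "d"], "b": ["c", "d"], "c": ["d"], "d": []}
ranking = most_cocited(links, pages)
```

In this toy graph, "c" and "d" are cited together by two pages, so they come first in `ranking`. A real system along the lines of the abstract would also weigh page content, which this sketch omits.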
Pages: 189 - 207
Number of pages: 18
Related papers
50 in total
  • [21] An Empirical Investigation of PageRank and Its Variants in Ranking Pages on the Web
    Ali, Fayyaz
    Ullah, Irfan
    Khusro, Shah
    [J]. PROCEEDINGS OF 14TH INTERNATIONAL CONFERENCE ON FRONTIERS OF INFORMATION TECHNOLOGY PROCEEDINGS - FIT 2016, 2016, : 354 - 359
  • [22] Improved link-based algorithms for ranking web pages
    Wang, ZY
    [J]. ADVANCES IN WEB-AGE INFORMATION MANAGEMENT: PROCEEDINGS, 2004, 3129 : 291 - 302
  • [23] A unified approach to ranking in probabilistic databases
    Li, Jian
    Saha, Barna
    Deshpande, Amol
    [J]. VLDB JOURNAL, 2011, 20 (02): : 249 - 275
  • [24] A Unified Approach to Ranking in Probabilistic Databases
    Li, Jian
    Saha, Barna
    Deshpande, Amol
    [J]. PROCEEDINGS OF THE VLDB ENDOWMENT, 2009, 2 (01): : 502 - 513
  • [25] A unified approach to ranking in probabilistic databases
    Jian Li
    Barna Saha
    Amol Deshpande
    [J]. The VLDB Journal, 2011, 20 : 249 - 275
  • [26] An Approach to Assess the Quality of Web Pages in the Deep Web
    Nie, Tiezheng
    Yu, Ge
    Shen, Derong
    Kou, Yue
    Yue, Dejun
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, DASFAA 2011, 2011, 6637 : 514 - 525
  • [27] A rendering approach for stereoscopic web pages
    Zhang, Jianlong
    Wang, Wenmin
    Wang, Ronggang
    Chen, Qinshui
    [J]. STEREOSCOPIC DISPLAYS AND APPLICATIONS XXV, 2014, 9011
  • [28] A novel web ranking algorithm based on pages multi-attribute
    Baker M.R.
    Akcayol M.A.
    [J]. International Journal of Information Technology, 2022, 14 (2) : 739 - 749
  • [29] An approach to identify duplicated Web pages
    Di Lucca, GA
    Di Penta, M
    Fasolino, AR
    [J]. 26TH ANNUAL INTERNATIONAL COMPUTER SOFTWARE AND APPLICATIONS CONFERENCE, PROCEEDINGS, 2002, : 481 - 486
  • [30] Patent citation network analysis: Ranking: From web pages to patents
    Érdi, Péter
    Bruck, Péter
    [J]. 2016, 9886 LNCS