Incremental C-Rank: An effective and efficient ranking algorithm for dynamic Web environments

Cited by: 6
Authors
Koo, Jangwan [1]
Chae, Dong-Kyu [1]
Kim, Dong-Jin [2]
Kim, Sang-Wook [1]
Affiliations
[1] Hanyang Univ, Seoul, South Korea
[2] Brainsoft Inc, Seongnam, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Information retrieval; Ranking algorithm; Dynamic ranking; INFORMATION; SIMILARITY; RETRIEVAL; SEARCH; LINKS;
DOI
10.1016/j.knosys.2019.03.034
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
Web page ranking is one of the core components of search engines. Given a user query, ranking aims to provide a ranked list of Web pages that the user is likely to prefer the most. By and large, the ranking algorithms can be categorized into content-based approaches, link-based approaches, and hybrid approaches. Hybrid ranking algorithms, which exploit both the content and link information, are the most popular and extensively studied techniques. Among the hybrid algorithms, C-Rank combines content and link information in a very effective way using the concept of contribution. This algorithm is known to provide high performance in terms of both accurate and prompt responses to user queries. However, C-Rank suffers from very high costs to reflect the highly dynamic and extremely frequent changes in the World Wide Web, because it re-computes all of the C-Rank scores used for ranking from scratch to reflect the changes. As a result, C-Rank may be considered inappropriate to provide users with accurate and up-to-date search results. This paper aims to remedy this limitation of C-Rank. We propose incremental C-Rank, which is designed to update the C-Rank scores of only a carefully chosen portion of the Web pages rather than those of all of the Web pages without any accuracy loss. Our experimental results on a real-world dataset confirm both the effectiveness and efficiency of our proposed method. (C) 2019 Elsevier B.V. All rights reserved.
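The general idea of updating only the affected scores, rather than recomputing everything, can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not define the C-Rank formula, so `toy_score` below is a hypothetical stand-in (a decayed sum of content weights reachable within a fixed number of link hops), and all function names and parameters are assumptions made for illustration. The sketch only handles changes to page content, not changes to links.

```python
from collections import deque

def reach(graph, start, max_hops):
    """Breadth-first search: map each page within max_hops of start to its hop distance."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if dist[page] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist

def toy_score(links, content, page, max_hops=2, decay=0.5):
    """Hypothetical stand-in score (NOT C-Rank): decayed sum of reachable content weights."""
    return sum(content[q] * decay ** d for q, d in reach(links, page, max_hops).items())

def full_recompute(links, content, max_hops=2):
    """Baseline: recompute every page's score from scratch."""
    return {p: toy_score(links, content, p, max_hops) for p in content}

def reverse(links):
    """Build the reversed link graph (target -> list of sources)."""
    rev = {}
    for src, dsts in links.items():
        for dst in dsts:
            rev.setdefault(dst, []).append(src)
    return rev

def incremental_update(scores, links, content, changed, max_hops=2):
    """Recompute only pages that can reach a changed page within max_hops."""
    rev = reverse(links)
    affected = set()
    for page in changed:
        affected.update(reach(rev, page, max_hops))
    for p in affected:
        scores[p] = toy_score(links, content, p, max_hops)
    return affected

# Toy link graph: e -> a -> b -> c -> d
links = {"a": ["b"], "b": ["c"], "c": ["d"], "d": [], "e": ["a"]}
content = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0, "e": 1.0}

scores = full_recompute(links, content)
content["d"] = 5.0  # the content of page d changes
affected = incremental_update(scores, links, content, {"d"})

# Only pages within two hops upstream of d are touched, yet the result
# matches a full recomputation.
print(sorted(affected))
print(scores == full_recompute(links, content))
```

In this toy setting the incremental update is exact because a page's score depends only on pages within `max_hops` of it, so any page farther upstream of the change keeps its old score. Whether and how such a locality property holds for the real C-Rank scores is what the paper itself addresses.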
Pages: 147-158
Page count: 12
Related papers
46 in total
  • [1] Incremental Maintenance of C-Rank Scores in Dynamic Web Environment
    Koo, Jangwan
    Kim, Dong-Jin
    Chae, Dong-Kyu
    Kim, Sang-Wook
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2017, : 1570 - 1574
  • [2] An Effective and Efficient Algorithm for Ranking Web Documents via Genetic Programming
    Baeza-Yates, Ricardo
    Cuzzocrea, Alfredo
    Crea, Domenico
    Lo Bianco, Giovanni
    [J]. SAC '19: PROCEEDINGS OF THE 34TH ACM/SIGAPP SYMPOSIUM ON APPLIED COMPUTING, 2019, : 1065 - 1072
  • [3] C-Rank and its variants: A contribution-based ranking approach exploiting links and content
    Kim, Dong-Jin
    Lee, Sang-Chul
    Son, Ho-Yong
    Kim, Sang-Wook
    Lee, Jae Bum
    [J]. JOURNAL OF INFORMATION SCIENCE, 2014, 40 (06) : 761 - 778
  • [4] An efficient algorithm to rank Web resources
    Zhang, D
    Dong, YS
    [J]. COMPUTER NETWORKS, 2000, 33 (1-6) : 449 - 455
  • [5] An Efficient Web Image Annotation Ranking Algorithm
    Zheng, Liu
    [J]. PROGRESS IN MEASUREMENT AND TESTING, PTS 1 AND 2, 2010, 108-111 : 81 - 87
  • [6] Trackback-Rank: An effective Ranking Algorithm for the Blog Search
    Kim, Jung-Hoon
    Yoon, Tae-Bok
    Kim, Kun-Su
    Lee, Jee-Hyong
    [J]. 2008 INTERNATIONAL SYMPOSIUM ON INTELLIGENT INFORMATION TECHNOLOGY APPLICATION, VOL III, PROCEEDINGS, 2008, : 503 - 507
  • [7] Effective Model And Implementation Of Dynamic Ranking In Web Pages
    Divjot
    Singh, Jaiteg
    [J]. 2015 FIFTH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS AND NETWORK TECHNOLOGIES (CSNT2015), 2015, : 1010 - 1014
  • [8] An Efficient Incremental Mining Algorithm for Dynamic Databases
    Driff, Lydia Nahla
    Drias, Habiba
    [J]. MINING INTELLIGENCE AND KNOWLEDGE EXPLORATION (MIKE 2016), 2017, 10089 : 1 - 12
  • [9] An efficient incremental algorithm for mining web traversal patterns
    Yen, SJ
    Lee, YS
    Hsieh, MC
    [J]. ICEBE 2005: IEEE INTERNATIONAL CONFERENCE ON E-BUSINESS ENGINEERING, PROCEEDINGS, 2005, : 274 - 281
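  • [10] C-Rank: A Credibility Assessment Method for Deep Web Data Records
    Ai, Jing
    Wang, Zhongyuan
    Meng, Xiaofeng
    [J]. JOURNAL OF FRONTIERS OF COMPUTER SCIENCE AND TECHNOLOGY, 2009, 3 (06) : 585 - 593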