DSphere: A source-centric approach to crawling, indexing and searching the world wide web

Cited by: 0
Authors
Bamba, Bhuvan [1 ]
Liu, Ling [1 ]
Caverlee, James [1 ]
Padliya, Vaibhav [1 ]
Srivatsa, Mudhakar [1 ]
Bansal, Tushar [1 ]
Palekar, Mahesh [1 ]
Patrao, Joseph [1 ]
Li, Suiyang [1 ]
Singh, Aameek [1 ]
Affiliation
[1] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
Source
2007 IEEE 23RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3 | 2007
Keywords
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We describe DSphere, a decentralized system for crawling, indexing, searching, and ranking documents on the World Wide Web. Unlike most existing search technologies, which depend heavily on a page-centric view of the Web, we advocate a source-centric view of the Web and propose a decentralized architecture for crawling, indexing, and searching the Web in a distributed, source-specific fashion. A fully decentralized crawler is developed to crawl the World Wide Web, in which each peer is assigned the responsibility of crawling a specific set of documents referred to as a source collection. Link analysis techniques are used for ranking documents. Traditional link analysis techniques suffer from problems such as slow refresh rates and vulnerability to Web spam. We propose a source-based link analysis approach, which computes fast and accurate ranking scores for all crawled documents.
Pages: 1490 / +
Page count: 2
Related papers
50 in total
  • [21] Searching for salvation: An analysis of US religious searching on the World Wide Web
    Jansen, Bernard J.
    Tapia, Andrea
    Spink, Amanda
    RELIGION, 2010, 40 (01) : 39 - 52
  • [22] Experimental study of searching strategy on World Wide Web
    Miura, A
    Fujihara, N
    INTERNATIONAL JOURNAL OF PSYCHOLOGY, 2000, 35 (3-4) : 84 - 84
  • [23] Searching for wisdom on the World Wide Web: A modest beginning
    Denault, PL
    BUSINESS AND ECONOMIC HISTORY, VOL 25, NO 1, FALL 1996, 1996 : 50 - 54
  • [24] Searching the World Wide Web: Challenges and partial solutions
    Baeza-Yates, RA
    PROGRESS IN ARTIFICIAL INTELLIGENCE-IBERAMIA 98, 1998, 1484 : 39 - 51
  • [25] In search of the unknown user: Indexing, hypertext and the World Wide Web
    Ellis, D
    Ford, N
    Furner, J
    JOURNAL OF DOCUMENTATION, 1998, 54 (01) : 28 - 47
  • [26] Content-centric interactive video on the World Wide Web
    Katkere, A
    Schlenzig, J
    Jain, R
    COMPUTER NETWORKS AND ISDN SYSTEMS, 1997, 29 (8-13) : 887 - 895
  • [27] Apoidea: A decentralized Peer-to-Peer architecture for crawling the World Wide Web
    Singh, A
    Srivatsa, M
    Liu, L
    Miller, T
    DISTRIBUTED MULTIMEDIA INFORMATION RETRIEVAL, 2004, 2924 : 126 - 142
  • [28] Searching the world-wide Web: Lycos, WebCrawler and more
    Notess, Greg R.
    Online (Wilton, Connecticut), 1995, 19 (04):
  • [29] Patterns of searching for information on the World Wide Web: A pilot study
    Fujihara, N
    Miura, A
    PSYCHOLOGICAL REPORTS, 2003, 92 (03) : 1091 - 1096
  • [30] Sequence databases and homology searching using World Wide Web
    Paterson, M
    MOLECULAR MEDICINE TODAY, 1996, 2 (03) : 98 - 102