DSphere: A source-centric approach to crawling, indexing and searching the World Wide Web

Citations: 0
Authors
Bamba, Bhuvan [1]
Liu, Ling [1]
Caverlee, James [1]
Padliya, Vaibhav [1]
Srivatsa, Mudhakar [1]
Bansal, Tushar [1]
Palekar, Mahesh [1]
Patrao, Joseph [1]
Li, Suiyang [1]
Singh, Aameek [1]
Affiliation
[1] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
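Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract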
We describe DSphere, a decentralized system for crawling, indexing, searching and ranking documents on the World Wide Web. Unlike most existing search technologies, which depend heavily on a page-centric view of the Web, we advocate a source-centric view and propose a decentralized architecture for crawling, indexing and searching the Web in a distributed, source-specific fashion. We develop a fully decentralized crawler in which each peer is responsible for crawling a specific set of documents, referred to as a source collection. Link analysis techniques are used to rank documents. Traditional link analysis techniques suffer from problems such as slow refresh rates and vulnerability to Web spam. We propose a source-based link analysis approach that computes fast and accurate ranking scores for all crawled documents.
Pages: 1490 / +
Number of pages: 2
Related papers
50 in total
  • [1] AJAXSearch: Crawling, Indexing and Searching Web 2.0 Applications
    Duda, Cristian
    Frey, Gianni
    Kossmann, Donald
    Zhou, Chong
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2008, 1 (02): 1440 - 1443
  • [2] Link-based ranking of the web with source-centric collaboration
    Caverlee, James
    Liu, Ling
    Rouse, William B.
    2006 INTERNATIONAL CONFERENCE ON COLLABORATIVE COMPUTING: NETWORKING, APPLICATIONS AND WORKSHARING, 2006: 61 - +
  • [3] A Novel Approach for Crawling the Opinions from World Wide Web
    Bhatia, Surbhi
    Sharma, Manisha
    Bhatia, Komal Kumar
    INTERNATIONAL JOURNAL OF INFORMATION RETRIEVAL RESEARCH, 2016, 6 (02): 1 - 23
  • [4] Searching the World Wide Web
    Lawrence, S
    Giles, CL
    SCIENCE, 1998, 280 (5360): 98 - 100
  • [5] Searching the World Wide Web
    Bekavac, B
    NACHRICHTEN FUR DOKUMENTATION, 1996, 47 (04): 195 - 213
  • [6] Searching the World Wide Web
    Mauldin, ML
    Selberg, E
    Etzioni, O
    IEEE EXPERT-INTELLIGENT SYSTEMS & THEIR APPLICATIONS, 1997, 12 (01): 8 - 11
  • [7] Searching for multimedia on the World Wide Web
    Cambridge Research Lab, Cambridge, United States
    Int Conf Multimedia Comput Syst Proc: 32 - 37
  • [8] Searching for multimedia on the World Wide Web
    Swain, MJ
    IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA COMPUTING AND SYSTEMS, PROCEEDINGS VOL 1, 1999: 32 - 37
  • [9] Searching for information on the world wide web
    Baggott, L
    Nichol, J
    Watson, K
    Poland, R
    JOURNAL OF BIOLOGICAL EDUCATION, 1999, 33 (03): 158 - 163
  • [10] Indexing pharmacogenetic knowledge on the World Wide Web
    Altman, RB
    Flockhart, DA
    Sherry, ST
    Oliver, DE
    Rubin, DL
    Klein, TE
    PHARMACOGENETICS, 2003, 13 (01): 3 - 5