Web Page Clustering using Heuristic Search in the Web Graph

Cited by: 0
Authors
Bekkerman, Ron [1]
Zilberstein, Shlomo [1]
Allan, James [1]
Affiliation
[1] Univ Massachusetts, Amherst, MA 01003 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Effective representation of Web search results remains an open problem in the Information Retrieval community. For ambiguous queries, a traditional approach is to organize search results into groups (clusters), one for each meaning of the query. These groups are usually constructed according to the topical similarity of the retrieved documents, but it is possible for documents to be totally dissimilar and still correspond to the same meaning of the query. To overcome this problem, we exploit the thematic locality of the Web: relevant Web pages are often located close to each other in the Web graph of hyperlinks. We estimate the level of relevance between each pair of retrieved pages by the length of a path between them. The path is constructed using multi-agent beam search: each agent starts with one Web page and attempts to meet as many other agents as possible with some bounded resources. We test the system on two types of queries: ambiguous English words and people names. The Web appears to be tightly connected; about 70% of the agents meet with each other after only three iterations of exhaustive breadth-first search. However, when heuristics are applied, the search becomes more focused and the obtained results are substantially more accurate. Combined with a content-driven Web page clustering technique, our heuristic search system significantly improves the clustering results.
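The multi-agent beam search described in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption rather than the authors' implementation: the abstract does not give the heuristic, the resource bounds, or the exact meeting criterion, so the `score` function, the `beam_width` and `max_iters` parameters, and the toy graph are hypothetical. In this sketch, two agents "meet" when they have both reached the same page, and the path length is the sum of their hop counts to that page.

```python
from itertools import combinations


def multi_agent_beam_search(graph, seeds, score, beam_width=2, max_iters=3):
    """Return {(seed_a, seed_b): path_length} for every pair of agents that meet.

    graph: dict mapping a page to the pages it links to (treated as given).
    score: hypothetical heuristic, score(seed, page) -> float, higher is better.
    """
    # depth[s][page] = number of hops agent s needed to reach page
    depth = {s: {s: 0} for s in seeds}
    frontier = {s: [s] for s in seeds}
    meetings = {}

    def record_meetings():
        for a, b in combinations(seeds, 2):
            if (a, b) in meetings:
                continue
            shared = depth[a].keys() & depth[b].keys()
            if shared:
                meetings[(a, b)] = min(depth[a][p] + depth[b][p] for p in shared)

    record_meetings()
    for it in range(1, max_iters + 1):
        for s in seeds:
            # Unvisited neighbours of the current beam
            candidates = {
                n
                for page in frontier[s]
                for n in graph.get(page, ())
                if n not in depth[s]
            }
            # Keep only the beam_width best-scoring candidates (ties broken by name)
            best = sorted(candidates, key=lambda p: (-score(s, p), p))[:beam_width]
            for p in best:
                depth[s][p] = it
            frontier[s] = best
        record_meetings()
    return meetings


# Toy link graph (hypothetical): pages "a" and "b" share the hub "h1"; "c" is isolated.
toy_graph = {
    "a": ["h1", "x"],
    "b": ["h1"],
    "c": ["y"],
    "h1": ["a", "b"],
    "x": ["a"],
    "y": ["c"],
}


def prefer_hubs(seed, page):
    # Stand-in heuristic: favour pages whose name starts with "h"
    return 1.0 if page.startswith("h") else 0.0


meetings = multi_agent_beam_search(
    toy_graph, ["a", "b", "c"], prefer_hubs, beam_width=1, max_iters=2
)
print(meetings)
```

On this toy graph the agents for "a" and "b" meet at "h1" with path length 2, while the agent for "c" meets no one. Pairs with short paths could then be treated as more likely to share a query meaning, which is the signal the abstract proposes to combine with content-based clustering.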
Pages: 2280-2285
Page count: 6
Related papers
50 in total
  • [1] Web page clustering using Harmony Search optimization
    Forsati, Rana
    Mahdavi, Mehrdad
    Kangavari, Mohammadreza
    Safarkhani, Banafsheh
    [J]. 2008 CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, VOLS 1-4, 2008, : 1530 - +
  • [2] A Novel Heuristic Page Rank Algorithm in Web Search
    He Yan-li
    [J]. OPTICAL, ELECTRONIC MATERIALS AND APPLICATIONS, PTS 1-2, 2011, 216 : 747 - 751
  • [3] Improvement of web data clustering using web page contents
    Xu, Y
    Weng, LT
    [J]. INTELLIGENT INFORMATION PROCESSING II, 2005, 163 : 521 - 530
  • [4] An Evolutionary Web Clustering for Web Page Predicting
    Wu, Rui
    Zhang, Ling
    [J]. JOURNAL OF INTERNET TECHNOLOGY, 2017, 18 (01): : 147 - 155
  • [5] Personalized Web Search Using Clickthrough Data and Web Page Rating
    Peng, XuePing
    Niu, ZhenDong
    Huang, Sheng
    Zhao, Yumin
    [J]. JOURNAL OF COMPUTERS, 2012, 7 (10) : 2578 - 2584
  • [6] A model of web page clustering using artificial ants
    Su, Yidan
    Dai, Shengxian
    Gu, Xinyi
    [J]. 2005 INTERNATIONAL SYMPOSIUM ON COMPUTER SCIENCE AND TECHNOLOGY, PROCEEDINGS, 2005, : 206 - 210
  • [7] WISE: Hierarchical soft clustering of web page search results based on web content mining techniques
    Campos, Ricardo
    Dias, Gaeal
    Nunes, Celia
    [J]. 2006 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE, (WI 2006 MAIN CONFERENCE PROCEEDINGS), 2006, : 301 - +
  • [8] Semantic Web and Web Page Clustering Algorithms: A Landscape View
    Obaid, Ahmed J.
    Chatterjee, Tanusree
    Bhattacharya, Abhishek
    [J]. Obaid, Ahmed J. (ahmedj.aljanaby@uokufa.edu.iq), 1600, European Alliance for Innovation (08): : 1 - 14
  • [9] Arabic Web page clustering: A review
    Alghamdi, Hanan M.
    Selamat, Ali
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2019, 31 (01) : 1 - 14
  • [10] WhatsOnWeb: Using graph drawing to search the Web
    Di Giacomo, E
    Didimo, W
    Grilli, L
    Liotta, G
    [J]. GRAPH DRAWING, 2006, 3843 : 480 - 491