Search and Browse Log Mining for Web Information Retrieval: Challenges, Methods, and Applications

Cited by: 0
Authors
Jiang, Daxin [1]
Pei, Jian
Li, Hang [1]
Affiliations
[1] Microsoft Research Asia, Beijing, People's Republic of China
Keywords
Search and browse logs; log data mining
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Search engines have accumulated huge amounts of search log data. A commercial search engine currently receives billions of queries and collects terabytes of log data in a single day. In addition to search logs, browse logs can be collected by client-side browser plug-ins, which record browsing information when users grant permission. On the one hand, such massive amounts of search and browse log data provide great opportunities to mine the wisdom of crowds and improve search results as well as online advertising. On the other hand, designing effective and efficient methods to clean, model, and process large-scale log data presents great challenges. In this tutorial, we focus on mining search and browse log data for Web information retrieval. We consider a Web information retrieval system consisting of four components: query understanding, document understanding, query-document matching, and user understanding. Accordingly, we organize the tutorial materials along these four aspects. For each aspect, we survey the major tasks, challenges, fundamental principles, and state-of-the-art methods. The goal of this tutorial is to provide the IR community with a systematic survey of large-scale search and browse log mining. It will help IR researchers become familiar with the core challenges and promising directions in log mining. At the same time, this tutorial may also serve developers of Web information retrieval systems as a comprehensive and in-depth reference on advanced log mining techniques.
Pages: 912-912
Number of pages: 1