Web Search and Browse Log Mining: Challenges, Methods, and Applications

Cited by: 0
Authors
Jiang, Daxin [1]
Affiliations
[1] Microsoft Res Asia, Beijing, Peoples R China
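Keywords
Search and browse logs; log data summarization; log mining applications
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract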
Search engines have accumulated huge amounts of search log data. A commercial search engine currently receives billions of queries and collects terabytes of log data every day. In addition to search logs, browse logs can be collected by client-side browser plug-ins, which record browsing information when users grant permission. On the one hand, such massive amounts of search and browse log data provide great opportunities to mine the wisdom of crowds and improve search results as well as online advertising. On the other hand, designing effective and efficient methods to clean, model, and process large-scale log data presents great challenges. In this tutorial, I will focus on mining search and browse log data for search engines. I will start with an introduction to search and browse log data and an overview of frequently used data summarizations in log mining. I will then elaborate on how log mining applications enhance the five major components of a search engine, namely query understanding, document understanding, query-document matching, user understanding, and monitoring and feedback. For each component, I will survey the major tasks, fundamental principles, and state-of-the-art methods. Finally, I will discuss the challenges and future trends of log data mining.
Pages: 465 - 466
Page count: 2
Related papers
50 in total
  • [21] Web-log mining for predictive Web caching
    Yang, Q
    Zhang, HH
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2003, 15 (04) : 1050 - 1053
  • [22] Web usage log markup language for web mining
    Zhang, Hui
    Song, Hantao
    Punine, John R.
    Journal of Computational Information Systems, 2007, 3 (03) : 971 - 980
  • [23] CHALLENGES FOR WEB MINING
    Kumar, A. Senthil
    Palanisamy, N.
    ICCN: 2008 INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND NETWORKING, 2008, : 630 - +
  • [24] Multilingual Probabilistic Topic Modeling and its Applications in Web Mining and Search
    Moens, Marie-Francine
    Vulic, Ivan
    WSDM'14: PROCEEDINGS OF THE 7TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, 2014, : 681 - 681
  • [25] The laborious way from data mining to web log mining
    Spiliopoulou, M
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 1999, 14 (02) : 113 - 126
  • [26] Web log data warehousing and mining for intelligent web caching
    Bonchi, F
    Giannotti, F
    Gozzi, C
    Manco, G
    Nanni, M
    Pedreschi, D
    Renso, C
    Ruggieri, S
    DATA & KNOWLEDGE ENGINEERING, 2001, 39 (02) : 165 - 189
  • [27] An overview of preprocessing of Web log files for Web usage mining
    Department of Computer Science, SDNB Vaishnav College for Women, Chennai, Tamil Nadu, India
    Unknown
    Unknown
    J. Theor. Appl. Inf. Technol., 2 (178-185):
  • [28] A Constraint Programming Approach for Web Log Mining
    Kemmar, Amina
    Lebbah, Yahia
    Loudni, Samir
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY AND WEB ENGINEERING, 2016, 11 (04) : 24 - 42
  • [29] Frequent Sequence Mining in Web Log Data
    Weichbroth, Pawel
    MAN-MACHINE INTERACTIONS 5, ICMMI 2017, 2018, 659 : 459 - 467
  • [30] Simple Web log mining system (SWLMS)
    Yang, Yiling
    Guan, Xudong
    Lu, Lina
    You, Jinyuan
    Shanghai Jiaotong Daxue Xuebao/Journal of Shanghai Jiaotong University, 2000, 34 (07) : 932 - 935