Mining Search Engine Query Logs via Suggestion Sampling

Authors
Bar-Yossef, Ziv [1, 2]
Gurevich, Maxim [1]
Affiliations
[1] Technion, Dept Elect Engn, IL-32000 Haifa, Israel
[2] Google Haifa Engn Ctr, Haifa, Israel
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2008 / Vol. 1 / No. 1
Funding
Israel Science Foundation
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Many search engines and other web applications suggest auto-completions as the user types in a query. The suggestions are generated from hidden underlying databases, such as query logs, directories, and lexicons. These databases consist of interesting and useful information, but they are typically not directly accessible. In this paper we describe two algorithms for sampling suggestions using only the public suggestion interface. One of the algorithms samples suggestions uniformly at random and the other samples suggestions proportionally to their popularity. These algorithms can be used to mine the hidden suggestion databases. Example applications include comparison of popularity of given keywords within a search engine's query log, estimation of the volume of commercially oriented queries in a query log, and evaluation of the extent to which a search engine exposes its users to negative content. Our algorithms employ Monte Carlo methods in order to obtain unbiased samples from the suggestion database. Empirical analysis using a publicly available query log demonstrates that our algorithms are efficient and accurate. Results of experiments on two major suggestion services are also provided.
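As a rough illustration of the idea in the abstract, the following Python sketch samples from a toy suggestion database through a prefix-only interface and corrects the sampling bias with importance weights. It is a minimal sketch under stated assumptions, not the algorithm from the paper: the query log `LOG`, the end-of-query marker `END`, the cutoff `k`, and the helper names `make_suggest`, `walk`, and `estimate_fraction` are all hypothetical, and only the uniform target is covered (not popularity-proportional sampling).

```python
import random

END = "$"  # hypothetical end-of-query marker, so "car" and "cars" get distinct paths
ALPHABET = "abcdefghijklmnopqrstuvwxyz " + END

# Toy stand-in for a hidden query log: query -> popularity (made-up numbers).
LOG = {
    "amazon": 90, "cat": 80, "car": 70, "dog": 60, "map": 55, "apple": 50,
    "bank": 40, "bbc": 30, "maps": 25, "cars": 20, "cow": 10,
}


def make_suggest(log, k):
    """Simulate a public interface: the k most popular completions of a prefix."""
    ranked = sorted(log, key=log.get, reverse=True)

    def suggest(prefix):
        return [q for q in ranked if (q + END).startswith(prefix)][:k]

    return suggest


def walk(suggest, k, rng):
    """Random walk down the prefix tree (assumes k >= 2).

    Returns (query, probability that this walk returns that query),
    or (None, 0.0) if the walk reaches a prefix with no completions.
    """
    prefix, prob = "", 1.0
    while True:
        results = suggest(prefix)
        if not results:
            return None, 0.0
        if len(results) < k:
            # Fewer than k results: every completion of the prefix is visible.
            return rng.choice(results), prob / len(results)
        # Possibly truncated list: extend the prefix by one random symbol.
        prefix += rng.choice(ALPHABET)
        prob /= len(ALPHABET)


def estimate_fraction(suggest, k, predicate, n, rng):
    """Self-normalized importance-sampling estimate of the fraction of
    suggestions satisfying `predicate`, targeting the uniform distribution."""
    num = den = 0.0
    for _ in range(n):
        query, prob = walk(suggest, k, rng)
        if query is None:
            continue  # dead end: contributes weight zero
        weight = 1.0 / prob  # undo the bias toward shallow, sparse branches
        den += weight
        if predicate(query):
            num += weight
    return num / den if den else float("nan")
```

For example, `estimate_fraction(make_suggest(LOG, 4), 4, lambda q: "car" in q, 200000, random.Random(0))` should come out close to the true fraction 2/11, since two of the eleven toy queries contain "car". Each query is reachable along exactly one path, which is what makes its sampling probability computable from the walk alone.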
Pages: 54 - 65
Page count: 12
Related papers
  • [1] Mining Named Entities from Search Engine Query Logs
    Alasiry, Areej
    Levene, Mark
    Poulovassilis, Alexandra
    [J]. PROCEEDINGS OF THE 18TH INTERNATIONAL DATABASE ENGINEERING AND APPLICATIONS SYMPOSIUM (IDEAS14), 2014 : 46 - 56
  • [2] Mining search engine query logs for social filtering-based query recommendation
    Zhang, Zhiyong
    Nasraoui, Olfa
    [J]. APPLIED SOFT COMPUTING, 2008, 8 (04) : 1326 - 1334
  • [3] Mining the Query Logs of a Chinese Web Search Engine for Character Usage Analysis
    Lu, Yan
    Chau, Michael
    Fang, Xiao
    [J]. PACIFIC ASIA CONFERENCE ON INFORMATION SYSTEMS 2006, SECTIONS 1-8, 2006 : 346 - +
  • [4] Mining Web search engines for query suggestion
    Xu, Zheng
    Luo, Xiangfeng
    Yu, Jie
    Xu, Weimin
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2011, 23 (10) : 1101 - 1113
  • [5] Data mining of search engine logs
    Whittle, Martin
    Eaglestone, Barry
    Ford, Nigel
    Gillet, Valerie J.
    Madden, Andrew
    [J]. JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2007, 58 (14) : 2382 - 2400
  • [6] Analysis of the query logs of a web site search engine
    Chau, M
    Fang, X
    Sheng, ORL
    [J]. JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2005, 56 (13) : 1363 - 1376
  • [7] Discovering Tasks from Search Engine Query Logs
    Lucchese, Claudio
    Orlando, Salvatore
    Perego, Raffaele
    Silvestri, Fabrizio
    Tolomei, Gabriele
    [J]. ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2013, 31 (03) : 1 - 43
  • [8] Mining Concept Sequences from Large-Scale Search Logs for Context-Aware Query Suggestion
    Liao, Zhen
    Jiang, Daxin
    Chen, Enhong
    Pei, Jian
    Cao, Huanhuan
    Li, Hang
    [J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2012, 3 (01)
  • [9] Intent mining in search query logs for automatic search script generation
    Wang, Chieh-Jen
    Chen, Hsin-Hsi
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2014, 39 (03) : 513 - 542