Mining Query Subtopics from Search Log Data

Cited by: 0
Authors
Hu, Yunhua [1]
Qian, Yanan [2]
Li, Hang [1]
Jiang, Daxin [1]
Pei, Jian [3]
Zheng, Qinghua [2]
Affiliations
[1] Microsoft Research Asia, Beijing, People's Republic of China
[2] Xi'an Jiaotong University, SPKLSTN Lab, Xi'an, People's Republic of China
[3] Simon Fraser University, Burnaby, BC, Canada
Funding
US National Science Foundation; Natural Sciences and Engineering Research Council of Canada
Keywords
Search Log Mining; User Behavior; Query Subtopics; Clustering; Search Result Clustering
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Most queries in web search are ambiguous and multifaceted. Identifying the major senses and facets of queries from search log data, referred to as query subtopic mining in this paper, is a very important issue in web search. Through search log analysis, we show that there are two interesting phenomena of user behavior that can be leveraged to identify query subtopics, referred to as 'one subtopic per search' and 'subtopic clarification by keyword'. One subtopic per search means that if a user clicks multiple URLs in one query, then the clicked URLs tend to represent the same sense or facet. Subtopic clarification by keyword means that users often add one or more keywords to expand the query in order to clarify their search intent. Thus, the keywords tend to be indicative of the sense or facet. We propose a clustering algorithm that can effectively leverage the two phenomena to automatically mine the major subtopics of queries, where each subtopic is represented by a cluster containing a number of URLs and keywords. The mined subtopics of queries can be used in multiple tasks in web search, and we evaluate them in aspects of search result presentation such as clustering and re-ranking. We demonstrate that our clustering algorithm can effectively mine query subtopics with an F1 measure in the range of 0.896-0.956. Our experimental results show that the use of the subtopics mined by our approach can significantly improve the state-of-the-art methods used for search result clustering. Experimental results based on click data also show that the re-ranking of search results based on our method can significantly improve the efficiency with which users find information.
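The two phenomena described in the abstract can be illustrated with a small sketch. This is not the authors' clustering algorithm, which the abstract does not specify; it is a deliberately simplified stand-in that links items with a union-find structure and reads off connected components. The function name `mine_subtopics`, the `(issued_query, clicked_urls)` record layout, and the `*.example` URLs are all hypothetical, and hard linking like this would likely over-merge on noisy real logs.

```python
from collections import defaultdict


def mine_subtopics(query, log):
    """Group URLs and keywords for `query` into candidate subtopics.

    `log` is an iterable of (issued_query, clicked_urls) records
    (a hypothetical layout chosen for this sketch).
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    q_terms = query.split()
    for issued, urls in log:
        terms = issued.split()
        # Skip records without clicks or that do not contain the query.
        if not urls or not set(q_terms) <= set(terms):
            continue
        nodes = [("url", u) for u in urls]
        # 'Subtopic clarification by keyword': terms added to the query
        # are attached to the URLs clicked for the expanded query.
        extra = [t for t in terms if t not in q_terms]
        if extra:
            nodes.append(("kw", " ".join(extra)))
        # 'One subtopic per search': everything seen in one search
        # is assumed to belong to the same subtopic.
        find(nodes[0])
        for node in nodes[1:]:
            union(nodes[0], node)

    clusters = defaultdict(lambda: {"urls": set(), "keywords": set()})
    for node in list(parent):
        kind, value = node
        field = "urls" if kind == "url" else "keywords"
        clusters[find(node)][field].add(value)
    return list(clusters.values())


# Hypothetical log for the ambiguous query "jaguar".
sample_log = [
    ("jaguar", ["cars.example/jaguar", "dealer.example/jaguar"]),
    ("jaguar", ["zoo.example/jaguar", "wiki.example/jaguar-cat"]),
    ("jaguar car", ["cars.example/jaguar"]),
    ("jaguar animal", ["zoo.example/jaguar"]),
]

for subtopic in mine_subtopics("jaguar", sample_log):
    print(sorted(subtopic["keywords"]), sorted(subtopic["urls"]))
```

On this toy log the sketch yields one cluster for the car sense and one for the animal sense, each holding URLs and a keyword, which mirrors the subtopic representation the abstract describes. A more realistic variant would presumably weight links by click frequency and merge clusters by similarity rather than by any single shared item.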
Pages: 305-314
Page count: 10