Discovering and understanding word level user intent in Web search queries

Cited by: 9
Authors
Roy, Rishiraj Saha [1]
Katare, Rahul [1]
Ganguly, Niloy [1]
Laxman, Srivatsan [2]
Choudhury, Monojit [3]
Affiliations
[1] Indian Inst Technol Kharagpur, Comp Sci & Engn, Kharagpur, W Bengal, India
[2] Scibler Technol Private Ltd, Bengaluru, Karnataka, India
[3] Microsoft Res India, Bengaluru, Karnataka, India
Source
JOURNAL OF WEB SEMANTICS | 2015 / Volume 30
Keywords
Query understanding; Query intent; Intent words; Co-occurrence entropy; TERM PROXIMITY;
DOI
10.1016/j.websem.2014.07.010
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Identifying and interpreting user intent are fundamental to semantic search. In this paper, we investigate the association of intent with individual words of a search query. We propose that words in queries can be classified as either content or intent, where content words represent the central topic of the query, while users add intent words to make their requirements more explicit. We argue that intelligent processing of intent words can be vital to improving the result quality, and in this work we focus on intent word discovery and understanding. Our approach towards intent word detection is motivated by the hypothesis that query intent words satisfy certain distributional properties in large query logs similar to function words in natural language corpora. Following this idea, we first prove the effectiveness of our corpus distributional features, namely, word co-occurrence counts and entropies, towards function word detection for five natural languages. Next, we show that reliable detection of intent words in queries is possible using these same features computed from query logs. To make the distinction between content and intent words more tangible, we additionally provide operational definitions of content and intent words as those words that should match, and those that need not match, respectively, in the text of relevant documents. In addition to a standard evaluation against human annotations, we also provide an alternative validation of our ideas using clickthrough data. Concordance of the two orthogonal evaluation approaches provides further support to our original hypothesis of the existence of two distinct word classes in search queries. Finally, we provide a taxonomy of intent words derived through rigorous manual analysis of large query logs. © 2014 Elsevier B.V. All rights reserved.
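The abstract names word co-occurrence counts and entropies as the distributional features but does not define them in this record. The following is a minimal sketch of one plausible reading, not the authors' implementation: for each word, count the distinct words it shares a query with and compute the Shannon entropy of that co-occurrence distribution. The function name `cooccurrence_features`, the whitespace tokenization, the whole-query co-occurrence window, and the toy log are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_features(queries):
    """Return {word: (distinct_cooccurring_words, entropy_in_bits)}.

    Assumption: two words co-occur when they appear in the same query.
    Words that only ever appear alone get no entry.
    """
    neighbors = defaultdict(Counter)
    for query in queries:
        words = query.lower().split()  # assumed tokenization: whitespace
        for i, word in enumerate(words):
            for j, other in enumerate(words):
                if i != j:
                    neighbors[word][other] += 1

    features = {}
    for word, counts in neighbors.items():
        total = sum(counts.values())
        # Shannon entropy of the word's co-occurrence distribution.
        entropy = -sum((c / total) * math.log2(c / total)
                       for c in counts.values())
        features[word] = (len(counts), entropy)
    return features


# Hypothetical toy query log, not data from the paper.
toy_log = [
    "nokia lumia price",
    "samsung galaxy price",
    "harry potter review",
    "nokia lumia review",
]
features = cooccurrence_features(toy_log)
for word in ("price", "review", "nokia", "potter"):
    print(word, features[word])
```

In this toy log, "price" and "review" pair with a wider and more evenly spread set of words than "nokia" or "potter", which is the kind of distributional signal the abstract associates with intent words. How the features are thresholded or fed to a classifier is left to the paper itself.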
Pages: 22 - 38
Number of pages: 17