Discovering and understanding word level user intent in Web search queries

Cited by: 9
Authors
Roy, Rishiraj Saha [1 ]
Katare, Rahul [1 ]
Ganguly, Niloy [1 ]
Laxman, Srivatsan [2 ]
Choudhury, Monojit [3 ]
Affiliations
[1] Indian Inst Technol Kharagpur, Comp Sci & Engn, Kharagpur, W Bengal, India
[2] Scibler Technol Private Ltd, Bengaluru, Karnataka, India
[3] Microsoft Res India, Bengaluru, Karnataka, India
Source
JOURNAL OF WEB SEMANTICS | 2015 / Vol. 30
Keywords
Query understanding; Query intent; Intent words; Co-occurrence entropy; TERM PROXIMITY;
DOI
10.1016/j.websem.2014.07.010
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Identifying and interpreting user intent are fundamental to semantic search. In this paper, we investigate the association of intent with individual words of a search query. We propose that words in queries can be classified as either content or intent, where content words represent the central topic of the query, while users add intent words to make their requirements more explicit. We argue that intelligent processing of intent words can be vital to improving the result quality, and in this work we focus on intent word discovery and understanding. Our approach towards intent word detection is motivated by the hypothesis that query intent words satisfy certain distributional properties in large query logs, similar to those of function words in natural language corpora. Following this idea, we first prove the effectiveness of our corpus distributional features, namely, word co-occurrence counts and entropies, towards function word detection for five natural languages. Next, we show that reliable detection of intent words in queries is possible using these same features computed from query logs. To make the distinction between content and intent words more tangible, we additionally provide operational definitions of content and intent words as those words that should match, and those that need not match, respectively, in the text of relevant documents. In addition to a standard evaluation against human annotations, we also provide an alternative validation of our ideas using clickthrough data. Concordance of the two orthogonal evaluation approaches provides further support to our original hypothesis of the existence of two distinct word classes in search queries. Finally, we provide a taxonomy of intent words derived through rigorous manual analysis of large query logs. (C) 2014 Elsevier B.V. All rights reserved.
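The abstract names word co-occurrence counts and entropies as its distributional features but does not define them. The following is a minimal sketch under one plausible reading: for each word, count the distinct words it appears with in the same query, and take the Shannon entropy of that co-occurrence distribution. The function name `cooccurrence_features`, the whole-query co-occurrence window, and the toy query log are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_features(queries):
    """Return {word: (distinct co-occurring words, co-occurrence entropy in bits)}.

    Assumption: two words co-occur when they appear in the same query.
    The paper's exact feature definitions may differ.
    """
    neighbours = defaultdict(Counter)
    for query in queries:
        words = query.lower().split()
        for i, word in enumerate(words):
            for j, other in enumerate(words):
                if i != j:
                    neighbours[word][other] += 1

    features = {}
    for word, counts in neighbours.items():
        total = sum(counts.values())
        # Shannon entropy of the distribution over co-occurring words.
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
        features[word] = (len(counts), entropy)
    return features


# Toy query log, made up for illustration only.
toy_log = [
    "harry potter download",
    "linux kernel download",
    "firefox download",
    "harry potter movie",
]
features = cooccurrence_features(toy_log)
for word in ("download", "harry"):
    print(word, features[word])
```

In this toy log, "download" appears with many unrelated words, so both its count and its entropy come out higher than those of "harry", which mostly appears with "potter". That is the kind of distributional contrast the abstract's hypothesis relies on, though a real query log would be needed to say anything reliable.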
Pages: 22 - 38
Page count: 17
Related papers
50 in total
  • [31] DWESM: An efficient entity-level search mechanism for deep web queries
    Kou, Yue
    Shen, Derong
    Nie, Tiezheng
    Yu, Ge
    Journal of Computational Information Systems, 2010, 6 (01): : 237 - 244
  • [32] Search Wandering Score: Predicting Timings of Online Shopping based on Wandering in User's Web Search Queries
    Tsubouchi, Kota
    Sasaki, Wataru
    Okoshi, Tadashi
    Nakazawa, Jin
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 1681 - 1688
  • [33] Investigation of Bias in Web Search Queries
    Haak, Fabian
    ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III, 2023, 13982 : 443 - 449
  • [34] Identifying ambiguous queries in web search
    Shanghai Jiao Tong University, Shanghai, China
    Unknown
    Unknown
    Int. World Wide Web Conf., (1169-1170):
  • [35] Identification of ambiguous queries in web search
    Song, Ruihua
    Luo, Zhenxiao
    Nie, Jian-Yun
    Yu, Yong
    Hon, Hsiao-Wuen
    INFORMATION PROCESSING & MANAGEMENT, 2009, 45 (02) : 216 - 229
  • [36] An analysis of web image queries for search
    Pu, HT
    ASIST 2003: PROCEEDINGS OF THE 66TH ASIST ANNUAL MEETING, VOL 40, 2003: HUMANIZING INFORMATION TECHNOLOGY: FROM IDEAS TO BITS AND BACK, 2003, 40 : 340 - 348
  • [37] Web search queries and prostate cancer
    Cacciamani, Giovanni E.
    Gill, Karanvir
    Gill, Inderbir S.
    LANCET ONCOLOGY, 2020, 21 (04): : 494 - 496
  • [38] Understanding and Leveraging the Impact of Response Latency on User Behaviour in Web Search
    Bai, Xiao
    Arapakis, Ioannis
    Barla Cambazoglu, B.
    Freire, Ana
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2017, 36 (02)
  • [39] Predicting the intent of sponsored search users: An exploratory user session-level analysis
    Im, Il
    Dunn, Brian Kimball
    Lee, Dong Il
    Galletta, Dennis F.
    Jeong, Seok-Oh
    DECISION SUPPORT SYSTEMS, 2019, 121 : 25 - 36
  • [40] From Keywords to Queries: Discovering the User's Intended Meaning
    Bobed, Carlos
    Trillo, Raquel
    Mena, Eduardo
    Ilarri, Sergio
    WEB INFORMATION SYSTEM ENGINEERING-WISE 2010, 2010, 6488 : 190 - 203