A new approach to query segmentation for relevance ranking in web search

Cited by: 4
Authors
Wu, Haocheng [1]
Hu, Yunhua [2]
Li, Hang [3]
Chen, Enhong [1]
Affiliations
[1] Univ Sci & Technol China, Hefei 230026, Peoples R China
[2] Alibaba Com, Beijing, Peoples R China
[3] Noahs Ark Lab Huawei Technol, Hong Kong, Hong Kong, Peoples R China
Source
INFORMATION RETRIEVAL JOURNAL | 2015 / Vol. 18 / Issue 01
Keywords
Web search; Query segmentation; Relevance ranking; Query processing; Re-ranking; BM25; Term dependency model; Key n-gram extraction;
DOI
10.1007/s10791-014-9246-7
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we try to determine how best to improve state-of-the-art methods for relevance ranking in web searching by query segmentation. Query segmentation is meant to separate the input query into segments, typically natural language phrases. We propose employing the re-ranking approach in query segmentation, which first employs a generative model to create the top k candidates and then employs a discriminative model to re-rank the candidates to obtain the final segmentation result. The method has been widely utilized for structure prediction in natural language processing, but has not been applied to query segmentation, as far as we know. Furthermore, we propose a new method for using the results of query segmentation in relevance ranking, which takes both the original query words and the segmented query phrases as units of query representation. We investigate whether our method can improve three relevance models, namely n-gram BM25, key n-gram model and term dependency model, within the framework of learning to rank. Our experimental results on large scale web search datasets show that our method can indeed significantly improve relevance ranking in all three cases.
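The two-stage idea described in the abstract (generate the top k candidate segmentations, then re-rank them) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy phrase counts, the smoothed segment-unigram scorer standing in for the generative model, the three hand-picked features and weights standing in for the discriminative model, and the function names. The last function shows one plausible reading of "both the original query words and the segmented query phrases as units of query representation".

```python
from itertools import combinations
from math import log

# Hypothetical phrase counts standing in for a trained generative model.
PHRASE_COUNTS = {
    ("new",): 100, ("york",): 60, ("times",): 80, ("square",): 40,
    ("new", "york"): 50, ("york", "times"): 20, ("times", "square"): 30,
    ("new", "york", "times"): 25,
}
TOTAL = sum(PHRASE_COUNTS.values())


def segmentations(words):
    """Yield every split of `words` into contiguous segments."""
    n = len(words)
    for r in range(n):
        for cuts in combinations(range(1, n), r):
            bounds = (0, *cuts, n)
            yield [tuple(words[a:b]) for a, b in zip(bounds, bounds[1:])]


def generative_score(seg):
    """Toy log-probability: product of add-one-smoothed segment frequencies."""
    return sum(log((PHRASE_COUNTS.get(s, 0) + 1) / (TOTAL + 1)) for s in seg)


def top_k(words, k):
    """Stage 1: keep the k best candidates under the generative score."""
    return sorted(segmentations(words), key=generative_score, reverse=True)[:k]


def discriminative_score(seg, weights):
    """Stage 2: linear model over a few illustrative features."""
    feats = {
        "gen": generative_score(seg),
        "num_segments": len(seg),
        "known_multiword": sum(1 for s in seg if len(s) > 1 and s in PHRASE_COUNTS),
    }
    return sum(weights[name] * value for name, value in feats.items())


def segment(words, k=3, weights=None):
    """Re-rank the top-k candidates and return the best segmentation."""
    # Hand-picked weights; a real system would learn them from labeled data.
    weights = weights or {"gen": 1.0, "num_segments": -0.5, "known_multiword": 2.0}
    return max(top_k(words, k), key=lambda seg: discriminative_score(seg, weights))


def query_units(words, seg):
    """Query representation: original words plus multi-word segments."""
    return list(words) + [" ".join(s) for s in seg if len(s) > 1]


query = ["new", "york", "times", "square"]
best = segment(query)
print(best, query_units(query, best))
```

Exhaustive enumeration is used here only because queries are short (a query of n words has 2^(n-1) segmentations); a production system would more likely use dynamic programming or beam search to obtain the top k.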
Pages: 26-50
Page count: 25
Related papers
50 in total
  • [21] Ranking Relevance in Yahoo Search
    Yin, Dawei
    Hu, Yuening
    Tang, Jiliang
    Daly, Tim, Jr.
    Zhou, Mianwei
    Ouyang, Hua
    Chen, Jianhui
    Kang, Changsung
    Deng, Hongbo
    Nobata, Chikashi
    Langlois, Jean-Marc
    Chang, Yi
    KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016: 323 - 332
  • [22] A transduction-based approach to fuzzy clustering, relevance ranking and cluster label generation on web search results
    Takazumi Matsumoto
    Edward Hung
    Journal of Intelligent Information Systems, 2012, 38 : 419 - 448
  • [23] A transduction-based approach to fuzzy clustering, relevance ranking and cluster label generation on web search results
    Matsumoto, Takazumi
    Hung, Edward
    JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2012, 38 (02) : 419 - 448
  • [24] PSkip: Estimating Relevance Ranking Quality from Web Search Clickthrough Data
    Wang, Kuansan
    Walker, Toby
    Zheng, Zijian
    KDD-09: 15TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2009: 1355 - 1363
  • [25] A Query Substitution-Search Result Refinement Approach for Long Query Web Searches
    Chen, Yan
    Zhang, Yan-Qing
    2009 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCES ON WEB INTELLIGENCE (WI) AND INTELLIGENT AGENT TECHNOLOGIES (IAT), VOL 1, 2009: 245 - 251
  • [26] Numeric Query Ranking Approach
    Wu, Jie
    Liu, Yi
    Wen, Ji-Rong
    PROCEEDINGS OF THE 22ND INTERNATIONAL CONFERENCE ON WORLD WIDE WEB (WWW'13 COMPANION), 2013: 229 - 230
  • [27] Ranking categories for web search
    Demartini, Gianluca
    Chirita, Paul-Alexandru
    Brunkhorst, Ingo
    Nejdl, Wolfgang
    ADVANCES IN INFORMATION RETRIEVAL, 2008, 4956 : 564 - +
  • [28] Query clustering for boosting web page ranking
    Baeza-Yates, R
    Hurtado, C
    Mendoza, M
    ADVANCES IN WEB INTELLIGENCE, PROCEEDINGS, 2004, 3034 : 164 - 175
  • [29] Ranking Keyword Search Results with Query Logs
    Zhou, Jing
    Yu, Xiaohui
    Liu, Yang
    Yu, Ziqiang
    2014 IEEE INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS), 2014: 770 - 771
  • [30] Ranking Methods for Query Relaxation in Book Search
    Kyozuka, Momo
    Tajima, Keishi
    2018 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE (WI 2018), 2018: 466 - 472