Query-Sensitive Similarity Measures for Information Retrieval

Cited by: 0
Authors
Anastasios Tombros
C.J. van Rijsbergen
Affiliation
[1] University of Glasgow, Department of Computing Science
Keywords
Information retrieval; Document clustering; Similarity measures; Nearest neighbor searching;
DOI
Not available
Abstract
The application of document clustering to information retrieval has been motivated by the potential effectiveness gains postulated by the cluster hypothesis. The hypothesis states that relevant documents tend to be highly similar to each other and therefore tend to appear in the same clusters. In this paper we propose an axiomatic view of the hypothesis by suggesting that documents relevant to the same query (co-relevant documents) display an inherent similarity to each other that is dictated by the query itself. Because of this inherent similarity, the cluster hypothesis should be valid for any document collection. Our research describes an attempt to devise means by which this similarity can be detected. We propose the use of query-sensitive similarity measures that bias interdocument relationships toward pairs of documents that jointly possess attributes expressed in a query. We experimentally tested three query-sensitive measures against conventional ones that do not take the query into account, and we also examined the comparative effectiveness of the three query-sensitive measures. We calculated interdocument relationships for varying numbers of top-ranked documents for six document collections. Our results show a consistent and significant increase in the number of relevant documents that become nearest neighbors of any given relevant document when query-sensitive measures are used. These results suggest that the effectiveness of a cluster-based information retrieval system has the potential to increase through the use of query-sensitive similarity measures.
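The idea of biasing interdocument similarity toward the query can be illustrated with a minimal sketch. The abstract does not define the three measures, so the form below is an assumption chosen for illustration: conventional cosine similarity between two documents is multiplied by the cosine similarity between their shared terms and the query. The function names, the averaging of shared-term weights, and the toy term vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def common_terms(d1, d2):
    """Terms present in both documents.

    Averaging the two weights is an illustrative assumption,
    not something stated in the abstract.
    """
    return {t: (d1[t] + d2[t]) / 2 for t in d1 if t in d2}


def query_sensitive_similarity(d1, d2, query):
    """Conventional similarity scaled by how well the shared terms match the query.

    Assumed form for illustration only; not the paper's definition.
    """
    return cosine(d1, d2) * cosine(common_terms(d1, d2), query)


# Hypothetical toy data: binary term weights.
query = {"cluster": 1, "retrieval": 1}
doc_a = {"cluster": 1, "retrieval": 1, "glasgow": 1}
doc_b = {"cluster": 1, "retrieval": 1, "hypothesis": 1}
doc_c = {"glasgow": 1, "football": 1, "weather": 1}

conventional_ab = cosine(doc_a, doc_b)
conventional_ac = cosine(doc_a, doc_c)
sensitive_ab = query_sensitive_similarity(doc_a, doc_b, query)
sensitive_ac = query_sensitive_similarity(doc_a, doc_c, query)
```

In this toy setting, doc_a and doc_c overlap only on a term that is absent from the query, so their query-sensitive score drops to zero, while doc_a and doc_b, which share the query terms, keep their conventional score. Under this assumed form, pairs that jointly possess query attributes are favoured when nearest neighbors are selected.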
Pages: 617-642
Page count: 25