Query-Sensitive Similarity Measures for Information Retrieval

Cited by: 0
Authors
Anastasios Tombros
C.J. van Rijsbergen
Affiliation
[1] University of Glasgow, Department of Computing Science
Keywords
Information retrieval; Document clustering; Similarity measures; Nearest neighbor searching
DOI
Not available
Abstract
The application of document clustering to information retrieval has been motivated by the potential effectiveness gains postulated by the cluster hypothesis. The hypothesis states that relevant documents tend to be highly similar to each other and therefore tend to appear in the same clusters. In this paper we propose an axiomatic view of the hypothesis by suggesting that documents relevant to the same query (co-relevant documents) display an inherent similarity to each other that is dictated by the query itself. Because of this inherent similarity, the cluster hypothesis should be valid for any document collection. Our research describes an attempt to devise means by which this similarity can be detected. We propose the use of query-sensitive similarity measures that bias interdocument relationships toward pairs of documents that jointly possess attributes expressed in a query. We experimentally tested three query-sensitive measures against conventional ones that do not take the query into account, and we also examined the comparative effectiveness of the three query-sensitive measures. We calculated interdocument relationships for varying numbers of top-ranked documents for six document collections. Our results show a consistent and significant increase in the number of relevant documents that become nearest neighbors of any given relevant document when query-sensitive measures are used. These results suggest that the effectiveness of a cluster-based information retrieval system has the potential to increase through the use of query-sensitive similarity measures.
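The idea of biasing interdocument similarity toward the query can be illustrated with a small sketch. The abstract does not give the formulas of the three measures, so the form below is an assumption made purely for illustration: a conventional cosine similarity between two documents is scaled by the cosine similarity between their shared terms and the query. The helper names (`cosine`, `common_terms`, `query_sensitive_sim`, `vec`) and the toy texts are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def common_terms(d1, d2):
    """Terms present in both documents; the mean weight is an assumed choice."""
    return {t: (d1[t] + d2[t]) / 2 for t in d1.keys() & d2.keys()}


def query_sensitive_sim(d1, d2, query):
    """Assumed illustrative form, not necessarily one of the paper's measures:
    conventional similarity scaled by how well the shared terms match the query."""
    return cosine(d1, d2) * cosine(common_terms(d1, d2), query)


def vec(text):
    """Raw term-frequency vector from whitespace-tokenized text."""
    return dict(Counter(text.lower().split()))


query = vec("document clustering")
d1 = vec("document clustering for retrieval")
d2 = vec("clustering document collections quickly")
d3 = vec("fast retrieval for image collections quickly")

# d1 and d2 share the query terms; d1 and d3 share only non-query terms.
print(cosine(d1, d2), query_sensitive_sim(d1, d2, query))
print(cosine(d1, d3), query_sensitive_sim(d1, d3, query))
```

In this toy setup the conventional cosine rates both pairs as fairly similar, while the query-sensitive variant keeps the score only for the pair whose shared terms are the ones expressed in the query.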
Pages: 617-642
Page count: 25