A retrospective study of a hybrid document-context based retrieval model

Cited by: 26
Authors
Wu, H. C. [1]
Luk, Robert W. P.
Wong, K. F.
Kwok, K. L.
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China
[2] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Shatin, Hong Kong, Peoples R China
[3] CUNY Queens Coll, Dept Comp Sci, Flushing, NY 11367 USA
Keywords
information retrieval; model; theory; retrospective experiment
DOI
10.1016/j.ipm.2006.10.009
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper describes our novel retrieval model that is based on contexts of query terms in documents (i.e., document contexts). Our model is novel because it explicitly takes the document contexts into account instead of implicitly using them to find query expansion terms. Our model is based on simulating a user making relevance decisions, and it is a hybrid of various existing effective models and techniques. It estimates the relevance decision preference of a document context as the log-odds and uses smoothing techniques as found in language models to solve the problem of zero probabilities. It combines these estimated preferences of document contexts using different types of aggregation operators that comply with different relevance decision principles (e.g., the aggregate relevance principle). Our model is evaluated using retrospective experiments (i.e., with full relevance information), because such experiments can (a) reveal the potential of our model, (b) isolate the problems of the model from those of the parameter estimation, (c) provide information about the major factors affecting the retrieval effectiveness of the model, and (d) show whether the model obeys the probability ranking principle. Our model is promising, as its mean average precision is 60-80% in our experiments using different TREC ad hoc English collections and the NTCIR-5 ad hoc Chinese collection. Our experiments showed that (a) the operators that are consistent with the aggregate relevance principle were effective in combining the estimated preferences, and (b) estimating probabilities using the contexts in the relevant documents can produce better retrieval effectiveness than using the entire relevant documents. © 2006 Elsevier Ltd. All rights reserved.
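The pipeline summarized in the abstract (extract the contexts around query terms, score each context as a smoothed log-odds, then aggregate the context scores into a document score) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the function names, the fixed window width, additive smoothing standing in for the language-model smoothing the abstract mentions, and `max` as the example aggregation operator.

```python
import math
from collections import Counter


def extract_contexts(doc_terms, query_terms, half_width=2):
    """Return a window of terms around each query-term occurrence.

    The window width is a hypothetical parameter chosen for illustration.
    """
    contexts = []
    for i, term in enumerate(doc_terms):
        if term in query_terms:
            start = max(0, i - half_width)
            contexts.append(doc_terms[start:i + half_width + 1])
    return contexts


def smoothed_prob(term, counts, total, vocab_size, alpha=1.0):
    """Additive smoothing (an assumed stand-in), so unseen terms never get zero."""
    return (counts[term] + alpha) / (total + alpha * vocab_size)


def context_log_odds(context, rel_counts, nonrel_counts, vocab_size):
    """Sum per-term log-odds of a relevant vs. a non-relevant term model."""
    rel_total = sum(rel_counts.values())
    nonrel_total = sum(nonrel_counts.values())
    score = 0.0
    for term in context:
        p_rel = smoothed_prob(term, rel_counts, rel_total, vocab_size)
        p_non = smoothed_prob(term, nonrel_counts, nonrel_total, vocab_size)
        score += math.log(p_rel / p_non)
    return score


def score_document(doc_terms, query_terms, rel_counts, nonrel_counts,
                   vocab_size, aggregate=max):
    """Aggregate the context preferences into one document score.

    `aggregate` is any callable over a list of floats; `max` is only an example.
    """
    contexts = extract_contexts(doc_terms, query_terms)
    if not contexts:
        return float("-inf")  # no query term occurs in the document
    prefs = [context_log_odds(c, rel_counts, nonrel_counts, vocab_size)
             for c in contexts]
    return aggregate(prefs)


# Toy term counts, invented for this example only.
rel_counts = Counter({"retrieval": 3, "model": 2})
nonrel_counts = Counter({"cooking": 4, "model": 1})
doc = ["hybrid", "retrieval", "model", "for", "cooking", "retrieval"]
print(score_document(doc, {"retrieval"}, rel_counts, nonrel_counts, vocab_size=4))
```

In a retrospective setting such as the one the abstract describes, the two count tables would be built from the judged relevant and non-relevant material; swapping `aggregate` for another callable (for example `sum`) changes only how the per-context preferences are combined.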
Pages: 1308-1331
Number of pages: 24
Related papers
50 in total
  • [21] A New Retrieval Model Based on TextTiling for Document Similarity Search
    Xiao-Jun Wan
    Yu-Xin Peng
    Journal of Computer Science and Technology, 2005, 20 : 552 - 558
  • [22] Research on Personalized Document Retrieval based on User Interest Model
    Xu Li
    Yang WeiZhong
    PROCEEDINGS OF 2012 7TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE & EDUCATION, VOLS I-VI, 2012, : 1771 - 1773
  • [23] Context Vector Model for Document Representation: A Computational Study
    Wei, Yang
    Wei, Jinmao
    Xu, Hengpeng
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2015, 2015, 9362 : 194 - 206
  • [24] A novel neighborhood based document smoothing model for information retrieval
    Goyal, Pawan
    Behera, Laxmidhar
    McGinnity, T. M.
    INFORMATION RETRIEVAL, 2013, 16 (03): : 391 - 425
  • [25] PARM: A Paragraph Aggregation Retrieval Model for Dense Document-to-Document Retrieval
    Althammer, Sophia
    Hofstaetter, Sebastian
    Sertkan, Mete
    Verberne, Suzan
    Hanbury, Allan
    ADVANCES IN INFORMATION RETRIEVAL, PT I, 2022, 13185 : 19 - 34
  • [26] Document/query expansion based on selecting significant concepts for context based retrieval of medical images
    Torjmen-Khemakhem, Mouna
    Gasmi, Karim
    JOURNAL OF BIOMEDICAL INFORMATICS, 2019, 95
  • [27] Embedding a Microblog Context in Ephemeral Queries for Document Retrieval
    Sethi, Shilpa
    JOURNAL OF WEB ENGINEERING, 2023, 22 (04): : 679 - 700
  • [28] Analysis of Probabilistic model for Document Retrieval in Information Retrieval
    Tamrakar, Astha
    Vishwakarma, Santosh K.
    2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS (CICN), 2015, : 760 - 765
  • [29] A Context Sensitive Document Indexing Approach for Information Retrieval
    Vanishree, M.
    Sudha, R.
    2014 INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND SIGNAL PROCESSING (ICCSP), 2014,
  • [30] Representation Sparsification with Hybrid Thresholding for Fast SPLADE-based Document Retrieval
    Qiao, Yifan
    Yang, Yingrui
    He, Shanxiu
    Yang, Tao
    PROCEEDINGS OF THE 46TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2023, 2023, : 2329 - 2333