A retrospective study of a hybrid document-context based retrieval model

Cited by: 26
Authors
Wu, H. C. [1]
Luk, Robert W. P.
Wong, K. F.
Kwok, K. L.
Affiliations
[1] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China
[2] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Shatin, Hong Kong, Peoples R China
[3] CUNY Queens Coll, Dept Comp Sci, Flushing, NY 11367 USA
Keywords
information retrieval; model; theory; retrospective experiment
DOI
10.1016/j.ipm.2006.10.009
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper describes our novel retrieval model that is based on contexts of query terms in documents (i.e., document contexts). Our model is novel because it explicitly takes into account the document contexts instead of implicitly using the document contexts to find query expansion terms. Our model is based on simulating a user making relevance decisions, and it is a hybrid of various existing effective models and techniques. It estimates the relevance decision preference of a document context as the log-odds and uses smoothing techniques as found in language models to solve the problem of zero probabilities. It combines these estimated preferences of document contexts using different types of aggregation operators that comply with different relevance decision principles (e.g., the aggregate relevance principle). Our model is evaluated using retrospective experiments (i.e., with full relevance information), because such experiments can (a) reveal the potential of our model, (b) isolate the problems of the model from those of the parameter estimation, (c) provide information about the major factors affecting the retrieval effectiveness of the model, and (d) show whether the model obeys the probability ranking principle. Our model is promising as its mean average precision is 60-80% in our experiments using different TREC ad hoc English collections and the NTCIR-5 ad hoc Chinese collection. Our experiments showed that (a) the operators that are consistent with the aggregate relevance principle were effective in combining the estimated preferences, and (b) estimating probabilities using the contexts in the relevant documents can produce better retrieval effectiveness than using the entire relevant documents. (c) 2006 Elsevier Ltd. All rights reserved.
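The scoring idea in the abstract (a smoothed log-odds preference per document context, then an aggregation over contexts) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the fixed-width window, the Jelinek-Mercer mixing weight `lam`, the probability floor, and the choice of `max` as the aggregation operator are all assumptions made for illustration, and every function name here is hypothetical. The abstract does not specify the actual operators or smoothing method.

```python
import math
from collections import Counter


def smoothed_prob(term, counts, total, background, lam=0.5, floor=1e-9):
    """Jelinek-Mercer style mixture of a sample estimate and a background
    model (an assumed smoothing choice). The floor avoids zero probabilities
    for terms missing from the background model."""
    sample = counts.get(term, 0) / total if total else 0.0
    return (1.0 - lam) * sample + lam * background.get(term, floor)


def context_log_odds(context, rel_counts, nonrel_counts, background, lam=0.5):
    """Log-odds that a context is relevant, summed over its terms."""
    rel_total = sum(rel_counts.values())
    nonrel_total = sum(nonrel_counts.values())
    score = 0.0
    for term in context:
        p_rel = smoothed_prob(term, rel_counts, rel_total, background, lam)
        p_non = smoothed_prob(term, nonrel_counts, nonrel_total, background, lam)
        score += math.log(p_rel / p_non)
    return score


def contexts_of(tokens, query_terms, half_width=2):
    """Fixed-width windows centred on each query-term occurrence
    (the window definition is an assumption)."""
    return [
        tokens[max(0, i - half_width): i + half_width + 1]
        for i, tok in enumerate(tokens)
        if tok in query_terms
    ]


def score_document(tokens, query_terms, rel_counts, nonrel_counts,
                   background, aggregate=max):
    """Combine per-context log-odds with an aggregation operator.
    `max` is a stand-in; `sum` or a mean could be passed instead."""
    scores = [
        context_log_odds(c, rel_counts, nonrel_counts, background)
        for c in contexts_of(tokens, query_terms)
    ]
    return aggregate(scores) if scores else float("-inf")


# Toy retrospective setup: term counts taken from contexts judged
# relevant and non-relevant (made-up data for illustration only).
rel_counts = Counter(["jaguar", "car", "speed"])
nonrel_counts = Counter(["jaguar", "cat", "jungle"])
vocab = ["jaguar", "car", "speed", "cat", "jungle"]
background = {t: 1.0 / len(vocab) for t in vocab}
```

In this toy setup, a document whose context around "jaguar" contains "car" receives a positive score and one whose context contains "cat" receives a negative score, because the relevance judgments are used directly to estimate the term probabilities, which is what makes the experiment retrospective.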
Pages: 1308-1331
Page count: 24