Improving Information Retrieval Through a Global Term Weighting Scheme

Cited by: 0
Authors
Cuellar, Daniel [1]
Diaz, Elva [1]
Ponce-de-Leon-Senti, Eunice [1]
Affiliations
[1] UAA, Basic Sci Ctr, Dept Comp Sci, Aguascalientes 20131, Aguascalientes, Mexico
Source
Keywords
Information retrieval; Indexing; Vector space model; Term weighting; Marginal distribution; Weighting scheme;
DOI
10.1007/978-3-319-19264-2_24
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The output of an information retrieval system is an ordered list of documents corresponding to the user query, which is represented by an input list of terms. This output relies on the estimated similarity between each document and the query, which in turn depends on the weighting scheme used for the terms of the document index. Term weighting therefore plays a major role in estimating that similarity. This paper proposes a new term weighting approach for information retrieval based on marginal frequencies, that is, the global count of term frequencies over the corpus of documents, whereas conventional term weighting schemes such as the normalized term frequency take into account the term frequencies within particular documents. The presented experiment shows the advantages and disadvantages of the proposed retrieval scheme. Performance measures such as precision, recall, and F-score are computed on classical benchmarks such as CACM to validate the experimental results.
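The abstract does not state the paper's actual weighting formula, so the following is only a minimal sketch of the general idea under stated assumptions. It contrasts a per-document normalized term frequency with a weight built from marginal (corpus-wide) term counts, ranks documents by cosine similarity in a vector space model, and computes precision, recall, and F-score. The function names, the whitespace tokenizer, and in particular the `count * log(total / marginal)` form of the global weight are hypothetical choices made for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def normalized_tf(doc_tokens):
    # Conventional local scheme: count divided by the largest count in the document.
    counts = Counter(doc_tokens)
    max_count = max(counts.values())
    return {term: count / max_count for term, count in counts.items()}


def marginal_frequencies(corpus_tokens):
    # Global count of each term summed over every document in the corpus.
    marginal = Counter()
    for tokens in corpus_tokens:
        marginal.update(tokens)
    return marginal


def global_weights(tokens, marginal):
    # Hypothetical global scheme (NOT the paper's formula): the local count is
    # scaled by the log of the inverse share of the term in the whole corpus.
    total = sum(marginal.values())
    counts = Counter(tokens)
    return {
        term: count * math.log(total / marginal[term])
        for term, count in counts.items()
        if marginal[term] > 0  # terms unseen in the corpus get no weight
    }


def cosine(u, v):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query_vec, doc_vecs):
    # Document indices ordered by decreasing similarity; ties keep index order.
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    return [idx for _, idx in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


def precision_recall_f1(retrieved, relevant):
    # Set-based precision, recall, and their harmonic mean (F-score).
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy corpus, for illustration only.
corpus = [
    "term weighting for retrieval",
    "retrieval of documents",
    "weather report",
]
corpus_tokens = [tokenize(doc) for doc in corpus]
marginal = marginal_frequencies(corpus_tokens)
doc_vecs = [global_weights(tokens, marginal) for tokens in corpus_tokens]
query_vec = global_weights(tokenize("retrieval"), marginal)
ranking = rank(query_vec, doc_vecs)
```

In this sketch the marginal counts are computed once for the whole corpus and reused for every document and query, whereas `normalized_tf` only ever looks at a single document. Swapping `global_weights` for `normalized_tf` when building `doc_vecs` and `query_vec` is one simple way to compare the two families of schemes on the same ranking and evaluation code.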
Pages: 246-257
Page count: 12
Related papers
50 in total
  • [41] Evolving general term-weighting schemes for information retrieval: Tests on larger collections
    Cummins, R
    O'Riordan, C
    ARTIFICIAL INTELLIGENCE REVIEW, 2005, 24 (3-4) : 277 - 299
  • [42] A novel term weighting scheme based on discrimination power obtained from past retrieval results
    Song, Sa-kwang
    Myaeng, Sung Hyon
    INFORMATION PROCESSING & MANAGEMENT, 2012, 48 (05) : 921 - 932
  • [43] Improving information retrieval of subjects through citation-analysis
    Gabel, Jeff
    KNOWLEDGE ORGANIZATION, 2006, 33 (02) : 86 - 95
  • [44] An information retrieval model by using weighting technology
    Shi, CG
    Lu, J
    PROCEEDINGS OF THE SECOND INTERNATIONAL CONFERENCE ON INFORMATION AND MANAGEMENT SCIENCES, 2002, 2 : 427 - 430
  • [45] Rank by Readability: Document Weighting for Information Retrieval
    Newbold, Neil
    McLaughlin, Harry
    Gillam, Lee
    ADVANCES IN MULTIDISCIPLINARY RETRIEVAL, 2010, 6107 : 20 - 30
  • [46] Comparing weighting models for monolingual information retrieval
    Amati, G
    Carpineto, C
    Romano, G
    COMPARATIVE EVALUATION OF MULTILINGUAL INFORMATION ACCESS SYSTEMS, 2003, 3237 : 310 - 318
  • [47] Combining Global and Local Semantic Contexts for Improving Biomedical Information Retrieval
    Dinh, Duy
    Tamine, Lynda
    ADVANCES IN INFORMATION RETRIEVAL, 2011, 6611 : 375 - 386
  • [48] Efficiency Implications of Term Weighting for Passage Retrieval
    Mackenzie, Joel
    Dai, Zhuyun
    Gallagher, Luke
    Callan, Jamie
    PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 1821 - 1824
  • [49] Supporting Text Retrieval by Typographical Term Weighting
    Werner, Lars
    Boettcher, Stefan
    INTERNATIONAL JOURNAL OF INTELLIGENT INFORMATION TECHNOLOGIES, 2007, 3 (02) : 1 - 16
  • [50] Improving information retrieval system security via an optimal maximal coding scheme
    Long, DY
    EURASIA-ICT 2002: INFORMATION AND COMMUNICATION TECHNOLOGY, PROCEEDINGS, 2002, 2510 : 127 - 134