Improving Information Retrieval Through a Global Term Weighting Scheme

Authors
Cuellar, Daniel [1]
Diaz, Elva [1]
Ponce-de-Leon-Senti, Eunice [1]
Affiliations
[1] UAA, Basic Sci Ctr, Dept Comp Sci, Aguascalientes 20131, Aguascalientes, Mexico
Keywords
Information retrieval; Indexing; Vector space model; Term weighting; Marginal distribution; Weighting scheme
DOI
10.1007/978-3-319-19264-2_24
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The output of an information retrieval system is an ordered list of documents matching the user query, which is given as an input list of terms. This ranking relies on the estimated similarity between each document and the query, which in turn depends on the weighting scheme applied to the terms of the document index. Term weighting therefore plays a major role in estimating that similarity. This paper proposes a new term weighting approach for information retrieval based on marginal frequencies, that is, the global count of term frequencies over the whole corpus of documents, whereas conventional term weighting schemes such as normalized term frequency take into account term frequencies within particular documents. The presented experiment shows the advantages and disadvantages of the proposed retrieval scheme. Performance measures such as precision, recall, and F-score are computed on classical benchmarks such as CACM to validate the experimental results.
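The abstract does not state the weighting formula itself, so the following is only a minimal sketch of the two quantities it contrasts, plus the evaluation measures it names. The function names, the toy corpus, and the choice of max-normalization for the per-document term frequency are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def normalized_tf(doc_tokens):
    """Per-document weighting: each term's count divided by the count of the
    most frequent term in that document (one common normalization; assumed)."""
    counts = Counter(doc_tokens)
    max_count = max(counts.values())
    return {term: count / max_count for term, count in counts.items()}


def marginal_frequencies(corpus):
    """Global counts: each term's frequency summed over every document."""
    totals = Counter()
    for doc_tokens in corpus:
        totals.update(doc_tokens)
    return totals


def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F-score for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy corpus of already-tokenized documents.
corpus = [
    ["term", "weighting", "term"],
    ["retrieval", "term"],
    ["retrieval", "index"],
]

local_weights = normalized_tf(corpus[0])       # depends on one document only
global_counts = marginal_frequencies(corpus)   # depends on the whole corpus
scores = precision_recall_f1(retrieved=[1, 2, 3, 4], relevant=[2, 4, 6])
```

In this sketch `normalized_tf` sees a single document, while `marginal_frequencies` aggregates over the corpus, which is the distinction the abstract draws. How the paper turns the marginal counts into final term weights is not specified here and is left open.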
Pages: 246-257
Page count: 12