A nonparametric term weighting method for information retrieval based on measuring the divergence from independence

Cited by: 0
Authors
İlker Kocabaş
Bekir Taner Dinçer
Bahar Karaoğlan
Affiliations
[1] Ege University,International Computer Institute
[2] Muğla University,Department of Statistics
[3] Muğla University,Department of Computer Engineering
Source
Information Retrieval | 2014 / Vol. 17
Keywords
Information retrieval; Nonparametric index term weighting; Statistical dependence; Pearson's chi-square statistic
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
In this article, we introduce an out-of-the-box automatic term weighting method for information retrieval. The method measures how far the observed frequency of a term in a document diverges from the frequency expected if terms and documents were independent. Divergence from independence has a well-established underlying statistical theory. It provides a plain, mathematically tractable, and nonparametric way of term weighting, and, moreover, it requires no term frequency normalization. Besides its sound theoretical background, the results of experiments performed on TREC test collections show that its performance is, in general, comparable to that of state-of-the-art term weighting methods. In both its theoretical and practical aspects, it is a simple but powerful baseline alternative to the state-of-the-art methods.
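The idea in the abstract can be illustrated with a minimal sketch. It assumes, since the abstract does not state the exact formula, that the weight is the standardized residual from Pearson's chi-square: the observed term frequency minus the frequency expected under independence, divided by the square root of that expected frequency, and set to zero when the term occurs no more often than expected. The function names, the toy collection, and the zero cutoff are illustrative assumptions, not the authors' definitions.

```python
import math
from collections import Counter


def expected_frequency(term_total, doc_length, collection_length):
    """Expected count of a term in a document if term and document were independent."""
    return term_total * doc_length / collection_length


def divergence_weight(tf, term_total, doc_length, collection_length):
    """Standardized residual (tf - e) / sqrt(e); assumed form, zero unless tf exceeds e."""
    e = expected_frequency(term_total, doc_length, collection_length)
    if tf <= e:
        return 0.0
    return (tf - e) / math.sqrt(e)


# Hypothetical toy collection, used only to exercise the functions.
docs = {
    "d1": "term weighting for retrieval".split(),
    "d2": "retrieval retrieval models".split(),
}
collection_length = sum(len(tokens) for tokens in docs.values())
term_totals = Counter(t for tokens in docs.values() for t in tokens)


def score(query, doc_id):
    """Sum the divergence weights of the query terms for one document."""
    tokens = docs[doc_id]
    tf = Counter(tokens)
    return sum(
        divergence_weight(tf[t], term_totals[t], len(tokens), collection_length)
        for t in query
    )


for doc_id in docs:
    print(doc_id, round(score(["retrieval"], doc_id), 3))
```

Document length enters only through the expected frequency, so this sketch needs no separate term frequency normalization step and has no tunable parameters, which matches the properties the abstract claims for the method.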
Pages: 153 - 176
Page count: 23
Related papers
50 in total
  • [1] A nonparametric term weighting method for information retrieval based on measuring the divergence from independence
    Kocabas, Ilker
    Dincer, Bekir Taner
    Karaoglan, Bahar
    INFORMATION RETRIEVAL, 2014, 17 (02): : 153 - 176
  • [2] Probabilistic models of information retrieval based on measuring the divergence from randomness
    Amati, G
    Van Rijsbergen, CJ
    ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2002, 20 (04) : 357 - 389
  • [3] Graph-based term weighting for information retrieval
    Blanco, Roi
    Lioma, Christina
    INFORMATION RETRIEVAL, 2012, 15 (01): : 54 - 92
  • [4] Part of Speech Based Term Weighting for Information Retrieval
    Lioma, Christina
    Blanco, Roi
    ADVANCES IN INFORMATION RETRIEVAL, PROCEEDINGS, 2009, 5478 : 412 - +
  • [5] Graph-based term weighting for information retrieval
    Roi Blanco
    Christina Lioma
    Information Retrieval, 2012, 15 : 54 - 92
  • [6] Term weighting for information retrieval based on term's discrimination power
    Li, Qing
    Lee, Seungwoo
    Jung, Hanmin
    Lee, Yeong Su
    Cho, Jae-Hyun
    Song, Sa-kwang
    MULTIMEDIA TOOLS AND APPLICATIONS, 2014, 71 (02) : 769 - 781
  • [7] Term weighting for information retrieval based on term’s discrimination power
    Qing Li
    Seungwoo Lee
    Hanmin Jung
    Yeong Su Lee
    Jae-Hyun Cho
    Sa-kwang Song
    Multimedia Tools and Applications, 2014, 71 : 769 - 781
  • [8] Query Aspect Based Term Weighting Regularization in Information Retrieval
    Zheng, Wei
    Fang, Hui
    ADVANCES IN INFORMATION RETRIEVAL, PROCEEDINGS, 2010, 5993 : 344 - 356
  • [9] Concept-based term weighting for web information retrieval
    Zakos, J
    Verma, B
    ICCIMA 2005: SIXTH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND MULTIMEDIA APPLICATIONS, PROCEEDINGS, 2005, : 173 - 178
  • [10] CONCEPT-BASED TERM WEIGHTING FOR WEB INFORMATION RETRIEVAL
    Zakos, John
    Verma, Brijesh
    INTERNATIONAL JOURNAL OF COMPUTATIONAL INTELLIGENCE AND APPLICATIONS, 2006, 6 (02) : 193 - 207