A new weighting algorithm for linear classifier

Cited by: 0
Authors
Chen, KL [1]
Zong, CQ [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100080, Peoples R China
Source
2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS | 2003
Keywords
weighting algorithm; variance; text categorization;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In text categorization (TC), the TF (term frequency)*IDF (inverse document frequency) and TF*IWF*IWF weighting algorithms are widely used. However, both algorithms are overly biased by term frequency and neglect the imbalance between classes. In this paper, we propose a new weighting algorithm named TF (term frequency)*IWF (inverse word frequency)*IWF (inverse word frequency)*VE (variance and expectation). The new algorithm improves on the TF*IWF*IWF weighting algorithm in both TF and VE. The paper compares the new algorithm with the TF*IWF*IWF algorithm both in theory and in experiment. In a preliminary experiment, the F1-measure improved by 11.78%.
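For readers unfamiliar with the two baseline schemes named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the common textbook definitions IDF = log(N / df) and IWF = log(total token count / corpus frequency of the term); the record does not state the exact formulas used in the paper. The function names `tf_idf` and `tf_iwf_iwf` are hypothetical. The VE (variance and expectation) factor is not sketched, because the abstract does not give its formula.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document using TF * IDF.

    Assumption: IDF(t) = log(N / df(t)), where N is the number of
    documents and df(t) is the number of documents containing t.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return [
        {t: tf * math.log(n_docs / doc_freq[t]) for t, tf in Counter(doc).items()}
        for doc in docs
    ]


def tf_iwf_iwf(docs):
    """Return one {term: weight} dict per document using TF * IWF * IWF.

    Assumption: IWF(t) = log(T / n(t)), where T is the total number of
    tokens in the corpus and n(t) is the corpus-wide count of t.
    """
    total_tokens = sum(len(doc) for doc in docs)
    corpus_freq = Counter(term for doc in docs for term in doc)
    return [
        {
            t: tf * math.log(total_tokens / corpus_freq[t]) ** 2
            for t, tf in Counter(doc).items()
        }
        for doc in docs
    ]


if __name__ == "__main__":
    # Toy corpus of pre-tokenized documents (illustrative data only).
    corpus = [["price", "stock", "price"], ["price", "match"]]
    print(tf_idf(corpus))
    print(tf_iwf_iwf(corpus))
```

Under these assumed definitions, both weights are multiplied directly by the raw term frequency and neither uses class labels, which is consistent with the two shortcomings the abstract attributes to the baselines.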
Pages: 650-655
Page count: 6
Related papers
50 in total
  • [31] ADAPTIVE LINEAR CLASSIFIER BY LINEAR PROGRAMMING
    IBARAKI, T
    MUROGA, S
    IEEE TRANSACTIONS ON SYSTEMS SCIENCE AND CYBERNETICS, 1970, SSC6 (01): : 53 - &
  • [32] A new association rule-based text classifier algorithm
    Buddeewong, S
    Kreesuradej, W
    ICTAI 2005: 17TH IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2005, : 684 - 685
  • [33] New Weighting Coefficients Algorithm of Weighted Geometric Means Combination Forecasting
    Liu Xinwei
    Xiao Yiping
    CONTEMPORARY INNOVATION AND DEVELOPMENT IN STATISTICAL SCIENCE, 2012, : 105 - 108
  • [34] Fuzzy stock selection using a new fuzzy ranking and weighting algorithm
    Tiryaki, F
    Ahlatcioglu, M
    APPLIED MATHEMATICS AND COMPUTATION, 2005, 170 (01) : 144 - 157
  • [35] A NEW ALGORITHM FOR LINEAR SYSTEM IDENTIFICATION
    SARIDIS, GN
    STEIN, G
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1968, AC13 (05) : 592 - &
  • [36] New linear algorithm for sequence analysis
    Unknown
    DR DOBBS JOURNAL, 2001, 26 (03): : 18 - 18
  • [37] A new algorithm on linear Diophantine equations
    Shi, Xiquan
    Liu, Fengshan
    Umoh, Hanson M.
    Gibson, Paul
    Hu, Zhitao
    DISCRETE AND COMPUTATIONAL MATHEMATICS, 2008, : 215 - +
  • [38] A better rim weighting algorithm
    Baxter, Michael
    INTERNATIONAL JOURNAL OF MARKET RESEARCH, 2016, 58 (04) : 621 - 634
  • [39] Integrating incremental feature weighting into Naive Bayes text classifier
    Kim, Han Joon
    Chang, Jaeyoung
    PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 1137 - 1143
  • [40] When a Constant Classifier is as Good as Any Linear Classifier
    Ellis, Steven P.
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2011, 40 (21) : 3800 - 3811