Improvement of Text Feature Selection Method based on TFIDF

Cited by: 17
Authors
Qu, Shouning [1]
Wang, Sujuan [1]
Zou, Yan [1]
Affiliation
[1] Univ Jinan, Sch Informat Sci & Engn, Jinan 250022, Shandong, Peoples R China
DOI
10.1109/FITME.2008.25
Chinese Library Classification number
F [Economics]
Discipline classification code
02
Abstract
TFIDF is a common method for selecting text features, but it has several disadvantages. First, it undervalues terms that appear frequently in the documents of one class but infrequently in the documents of other classes, even though such terms represent the characteristics of that class well. Second, TFIDF neglects the relationship between a feature and the class. This paper proposes an improved TFIDF strategy and combines it with the simple distance vector text classification method to compare it against traditional TFIDF. The experiments yielded good classification results and demonstrated the feasibility of the approach.
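The first disadvantage named in the abstract can be illustrated with a minimal sketch of classic TFIDF. This is not the paper's improved strategy, whose formula the abstract does not give. The function name `tfidf`, the weighting `tf * log(N / df)`, and the toy corpus with its class labels are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Classic TF-IDF for tokenized documents: tf(t, d) * log(N / df(t)).

    Hypothetical helper for illustration; class labels are never consulted.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Made-up corpus: documents 0-1 are "sports", documents 2-3 are "finance".
docs = [
    ["goal", "team", "match"],    # sports
    ["goal", "coach", "win"],     # sports
    ["team", "stock", "market"],  # finance
    ["bank", "rate", "stock"],    # finance
]

weights = tfidf(docs)

# "goal" occurs only in sports documents; "team" is split across both classes.
# Both occur in two documents, so classic TF-IDF gives them the same weight
# in document 0 even though "goal" is the better indicator of the class.
print(weights[0]["goal"], weights[0]["team"])
```

In this toy corpus the two weights are identical because the formula depends only on how many documents contain a term, not on which class those documents belong to. That is the gap a class-aware variant such as the one proposed here would need to address.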
Pages: 79-81
Number of pages: 3