Improvement of Text Feature Selection Method based on TFIDF

Cited by: 17
Authors
Qu, Shouning [1]
Wang, Sujuan [1]
Zou, Yan [1]
Affiliations
[1] Univ Jinan, Sch Informat Sci & Engn, Jinan 250022, Shandong, Peoples R China
DOI
10.1109/FITME.2008.25
Chinese Library Classification number
F [Economics];
Discipline classification code
02;
Abstract
TFIDF is a common method for selecting text features, but it has notable disadvantages. First, it undervalues terms that appear frequently in the documents of one class but infrequently in the documents of other classes, even though such terms can represent the characteristics of that class. Second, TFIDF neglects the relationship between a feature and the class. This paper proposes an improved TFIDF strategy and combines it with a simple distance-vector text classification method to compare it against traditional TFIDF. The improved strategy obtained very good classification results, and the experiment demonstrated its feasibility.
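The first disadvantage named in the abstract can be illustrated with a minimal sketch of traditional TFIDF. This is not the improved strategy from the paper, whose formula the abstract does not give. The weighting tf(t, d) * log(N / df(t)), the function name `tfidf`, and the toy corpus are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def tfidf(docs):
    """Traditional TFIDF (assumed form): tf(t, d) * log(N / df(t)).

    Class labels are never consulted, so a term confined to one class
    and a term spread across classes can receive the same weight.
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights

# Hypothetical toy corpus: two "sports" documents, then two "finance" documents.
docs = [
    ["goal", "match", "report"],  # sports
    ["goal", "team"],             # sports
    ["stock", "report"],          # finance
    ["stock", "bank"],            # finance
]
weights = tfidf(docs)

# "goal" occurs only in sports documents, "report" occurs in both classes,
# yet both have document frequency 2 and so get the same weight in document 0.
print(weights[0]["goal"], weights[0]["report"])
```

In this toy corpus "goal" is clearly the better indicator of the sports class, but the class-blind weighting cannot tell it apart from "report". That is the gap a class-aware weighting such as the one the paper proposes would need to close.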
Pages: 79-81
Number of pages: 3