Application of an Improved CHI Feature Selection Algorithm

Cited by: 9
Authors
Cai, Liang-jing [1]
Lv, Shu [1]
Shi, Kai-bo [2]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Math Sci, Chengdu 611731, Sichuan, Peoples R China
[2] Chengdu Univ, Sch Elect Informat & Elect Engn, Chengdu 610106, Sichuan, Peoples R China
DOI
10.1155/2021/9963382
Chinese Library Classification (CLC) number
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
Text classification is a core task in machine learning and is widely applied in information filtering, sentiment analysis, and text review. Improving classification accuracy has been the main research goal in this field in recent years. Feature selection plays an important role in text classification: it eliminates irrelevant features, reduces dimensionality, and improves classification accuracy. This paper studies the CHI feature selection algorithm, and its main work and contributions are as follows. First, the paper analyzes the flaws of the CHI algorithm, identifies the introduction of new parameters as the direction for improvement, and proposes a new algorithm based on variance and the coefficient of variation. Second, experiments are designed to verify the effectiveness of the new algorithm. In terms of language, the experiments cover two text classification systems, Chinese and English. In terms of classifiers, two are used: the KNN classifier and the Naive Bayes classifier. In terms of data, two distribution types are used: balanced and unbalanced datasets. Finally, the paper conducts three comparative experiments and analyzes the results of each. All experimental results show significant improvement over those obtained before the modification.
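For orientation, the baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of the standard (unimproved) CHI statistic for a term and a category, chi2(t, c) = N(AD - BC)^2 / ((A + C)(B + D)(A + B)(C + D)), and it does not reproduce the paper's variance and coefficient-of-variation modification, which the abstract does not specify. The function names, the tokenized-document input format, and the choice to score each term by its maximum over categories are assumptions made for this example only.

```python
from collections import Counter, defaultdict


def chi_square(a, b, c, d):
    """Standard CHI score for one term and one category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate contingency table carries no signal
    return n * (a * d - b * c) ** 2 / denom


def chi_scores(docs, labels):
    """Score every term by its maximum CHI over all categories (assumed policy)."""
    n_docs = len(docs)
    cat_sizes = Counter(labels)
    term_cat = defaultdict(Counter)  # term -> category -> document count
    for tokens, label in zip(docs, labels):
        for term in set(tokens):  # count each term once per document
            term_cat[term][label] += 1

    scores = {}
    for term, per_cat in term_cat.items():
        doc_freq = sum(per_cat.values())
        best = 0.0
        for cat, size in cat_sizes.items():
            a = per_cat[cat]
            b = doc_freq - a
            c = size - a
            d = n_docs - size - b
            best = max(best, chi_square(a, b, c, d))
        scores[term] = best
    return scores


def select_top_k(docs, labels, k):
    """Return the k highest-scoring terms, ties broken alphabetically."""
    scores = chi_scores(docs, labels)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    docs = [["good", "movie"], ["good", "plot"], ["bad", "movie"], ["bad", "plot"]]
    labels = ["pos", "pos", "neg", "neg"]
    print(select_top_k(docs, labels, 2))
```

In this toy corpus, "good" and "bad" each occur in only one class and so receive the highest scores, while "movie" and "plot" are spread evenly across both classes and score zero.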
Pages: 8