A survey on dimension reduction techniques in text classification

Authors
Wang, Zhi Juan [1, 2]
Zhou, Ruo Song [1]
Affiliations
[1] Minzu Univ China, Coll Informat Engn, Beijing, Peoples R China
[2] Natl Language Resource Monitoring & Res Ctr, Minor Languages Branch, Beijing, Peoples R China
Keywords
Dimension reduction; Text classification; Feature selection; Feature extraction; CDF
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Dimension reduction is a key step in text classification. Feature selection and feature extraction are the two common approaches to dimension reduction. In this paper, we discuss dimension reduction techniques in two groups: traditional methods (Information Gain, Mutual Information, Document Frequency, Correlation Coefficient) and newer methods (Optimization Mutual Information Based on Word Frequency, CDF (Concentration, Dispersion and Frequency), Semantic Relatedness). We then analyze the principles of these methods and describe their advantages and disadvantages.
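As a rough illustration of two of the traditional feature selection methods named in the abstract, the sketch below computes Document Frequency and Information Gain on a toy corpus and keeps the top-scoring terms. It uses the common textbook definitions (entropy of the class distribution minus the conditional entropy given term presence), which may differ in detail from the formulations analyzed in the paper. The function names, the `min_df` parameter, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set(): count each term once per document
    return df


def information_gain(docs, labels, term):
    """Class entropy minus class entropy conditioned on term presence."""
    n = len(docs)
    with_term = Counter(l for d, l in zip(docs, labels) if term in d)
    without_term = Counter(l for d, l in zip(docs, labels) if term not in d)
    h_cond = 0.0
    for part in (with_term, without_term):
        size = sum(part.values())
        if size:
            h_cond += (size / n) * entropy(part.values())
    return entropy(Counter(labels).values()) - h_cond


def select_features(docs, labels, k, min_df=1):
    """Hypothetical helper: drop rare terms by DF, then keep the top-k by IG."""
    df = document_frequency(docs)
    scored = {t: information_gain(docs, labels, t)
              for t, c in df.items() if c >= min_df}
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(scored, key=lambda t: (-scored[t], t))[:k]


# Toy corpus (invented for this example): tokenized documents and class labels.
docs = [
    ["ball", "goal", "team"],
    ["team", "match", "goal"],
    ["stock", "market", "price"],
    ["market", "team", "price"],
]
labels = ["sport", "sport", "finance", "finance"]

selected = select_features(docs, labels, k=3)
```

Here "goal", "market", and "price" each split the two classes perfectly, so they score higher than "team", which appears in both classes. Filtering by document frequency first is a cheap way to shrink the candidate set before the more expensive scoring step.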
Pages: 633-635
Number of pages: 3