A method of dimensionality reduction by selection of components in principal component analysis for text classification

Cited by: 5
Authors
Zhang, Yangwu [1, 2]
Li, Guohe [1, 3]
Zong, Heng [2]
Affiliations
[1] China Univ Petr, Coll Geophys & Informat Engn, Beijing, Peoples R China
[2] China Univ Polit Sci & Law, Dept Sci & Technol Teaching, Beijing, Peoples R China
[3] China Univ Petr, Beijing Key Lab Data Min Petr Data, Beijing, Peoples R China
Keywords
Principal components analysis; Dimensionality reduction; Text classification
DOI
10.2298/FIL1805499Z
Chinese Library Classification (CLC) number
O29 [Applied mathematics]
Discipline classification code
070104
Abstract
Dimensionality reduction, which comprises feature extraction and feature selection, is a key step in text classification. In this paper, we propose a hybrid dimensionality-reduction method that combines principal component analysis (PCA) with a subsequent selection of components. PCA is a feature-extraction method, but not all of its components contribute to classification, because the PCA objective is not a form of discriminant analysis (see, e.g., Jolliffe, 2002). We therefore present a component-selection function that returns the components useful for classification, using classification performance on different subsets of components as the indicator. Compared with traditional feature-selection methods, SVM classifiers trained on the selected components show improved classification performance and reduced computational overhead.
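
The idea in the abstract (extract components with PCA, then keep only those that help classification) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' procedure: the abstract does not specify the search strategy, so a greedy forward search is used; a nearest-centroid classifier stands in for the SVM to keep the example dependent on NumPy only; the data are synthetic; and the function names `pca_fit`, `centroid_accuracy`, and `select_components` are hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return (mean, components); rows of components are principal axes."""
    mean = X.mean(axis=0)
    # SVD of the centered data: rows of Vt are directions of decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def centroid_accuracy(Z_train, y_train, Z_val, y_val):
    """Validation accuracy of a nearest-centroid classifier (stand-in for an SVM)."""
    classes = np.unique(y_train)
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((Z_val[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred == y_val).mean())

def select_components(Z_train, y_train, Z_val, y_val):
    """Greedy forward selection of component indices by validation accuracy."""
    remaining = list(range(Z_train.shape[1]))
    selected, best = [], 0.0
    while remaining:
        scores = [
            centroid_accuracy(Z_train[:, selected + [j]], y_train,
                              Z_val[:, selected + [j]], y_val)
            for j in remaining
        ]
        k = int(np.argmax(scores))
        if scores[k] <= best:  # stop when no candidate strictly improves accuracy
            break
        best = scores[k]
        selected.append(remaining.pop(k))
    return selected, best

# Synthetic data (assumption): feature 0 is high-variance noise with no class
# information; feature 1 carries the class signal at lower variance.
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, 5)) * np.array([10.0, 0.5, 0.1, 0.1, 0.1])
X[:, 1] += np.where(y == 1, 3.0, -3.0)

X_train, y_train = X[:100], y[:100]
X_val, y_val = X[100:], y[100:]

# Fit PCA on the training split only, then project both splits.
mean, components = pca_fit(X_train, n_components=5)
Z_train = (X_train - mean) @ components.T
Z_val = (X_val - mean) @ components.T

selected, best = select_components(Z_train, y_train, Z_val, y_val)
print("selected component indices:", selected)
print("validation accuracy:", best)
```

On this toy data the highest-variance component (index 0) is expected to be mostly noise, so the search should start with the second component (index 1). That illustrates why ranking components by variance alone need not match their usefulness for classification.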
Pages: 1499-1506
Number of pages: 8