A method of dimensionality reduction by selection of components in principal component analysis for text classification

Cited by: 5
Authors
Zhang, Yangwu [1, 2]
Li, Guohe [1, 3]
Zong, Heng [2]
Affiliations
[1] China Univ Petr, Coll Geophys & Informat Engn, Beijing, Peoples R China
[2] China Univ Polit Sci & Law, Dept Sci & Technol Teaching, Beijing, Peoples R China
[3] China Univ Petr, Beijing Key Lab Data Min Petr Data, Beijing, Peoples R China
Keywords
Principal components analysis; Dimensionality reduction; Text classification
DOI
10.2298/FIL1805499Z
Chinese Library Classification number
O29 [Applied mathematics]
Discipline classification code
070104
Abstract
Dimensionality reduction, which comprises feature extraction and feature selection, is a key step in text classification. In this paper, we propose a hybrid dimensionality reduction method that combines principal component analysis (PCA) with a selection of components. PCA is a feature extraction method. Not all of its components contribute to classification, because the PCA objective does not use class labels and is not a form of discriminant analysis (see, e.g., Jolliffe, 2002). We therefore present a component selection function that returns the components useful for classification, judged by classification performance on different subsets of the components. Compared with traditional feature selection methods, SVM classifiers trained on the selected components show improved classification performance and reduced computational overhead.
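The idea of selecting components by classification performance, rather than by explained variance, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the search strategy, so greedy forward selection is assumed here; a nearest-centroid classifier stands in for the SVM to keep the example dependency-light; the data are synthetic; and the names `pca_fit`, `centroid_accuracy`, and `select_components` are hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the feature mean and the top principal axes (rows), ordered by decreasing variance."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def centroid_accuracy(Z_train, y_train, Z_val, y_val):
    """Validation accuracy of a nearest-centroid classifier (assumed stand-in for the SVM)."""
    classes = np.unique(y_train)
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((Z_val[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[dists.argmin(axis=1)] == y_val).mean())

def select_components(Z_train, y_train, Z_val, y_val):
    """Greedy forward selection (an assumption): keep adding the component that
    most improves validation accuracy, and stop when nothing improves it."""
    remaining = list(range(Z_train.shape[1]))
    selected, best = [], 0.0
    while remaining:
        scores = [
            centroid_accuracy(Z_train[:, selected + [j]], y_train,
                              Z_val[:, selected + [j]], y_val)
            for j in remaining
        ]
        k = int(np.argmax(scores))
        if scores[k] <= best:
            break
        best = scores[k]
        selected.append(remaining.pop(k))
    return selected, best

# Synthetic data: the class signal lives in a LOW-variance feature (column 1),
# while columns 0 and 2 are high-variance noise that dominates the leading components.
rng = np.random.default_rng(0)
n = 400
y = np.arange(n) % 2
X = np.column_stack([
    rng.normal(0, 10.0, n),               # noise, largest variance
    (2 * y - 1) + rng.normal(0, 0.3, n),  # class signal, small variance
    rng.normal(0, 5.0, n),                # noise, second-largest variance
    rng.normal(0, 0.1, n),                # noise, smallest variance
])
train, val = slice(0, 200), slice(200, 400)

mean, axes = pca_fit(X[train], n_components=4)
Z_train = (X[train] - mean) @ axes.T
Z_val = (X[val] - mean) @ axes.T

selected, best_score = select_components(Z_train, y[train], Z_val, y[val])
top1_score = centroid_accuracy(Z_train[:, :1], y[train], Z_val[:, :1], y[val])
print("selected components:", selected)
print("accuracy on selected:", best_score, "| accuracy on top-variance component:", top1_score)
```

In this toy setting the highest-variance component carries almost no class information, so ranking components by variance alone would keep the wrong ones; scoring subsets by validation accuracy recovers the discriminative component instead.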
Pages: 1499-1506
Number of pages: 8