A Weighted Principal Component Analysis and Its Application to Gene Expression Data

Cited by: 28
Authors
da Costa, Joaquim F. Pinto [1, 2]
Alonso, Hugo [3, 4, 5]
Roque, Luis [6 ]
Affiliations
[1] Univ Porto, Fac Ciencias, Dept Matemat, P-4169007 Oporto, Portugal
[2] Univ Porto CMUP, Ctr Matemat, Oporto, Portugal
[3] Univ Lusofona Porto, Fac Econ & Gestao, P-4000098 Oporto, Portugal
[4] Univ Aveiro, Dept Matemat, P-3810193 Aveiro, Portugal
[5] Univ Aveiro, CIDMA, Aveiro, Portugal
[6] Inst Super Engn Porto, Grp Invest Engn Conhecimento & Apoio Decisao GECA, P-4200072 Oporto, Portugal
Keywords
Correlation; principal component analysis; support vector machines; microarray data; gene selection; LYMPH-NODE METASTASIS; RANK MEASURE; CANCER; CLASSIFICATION; MICROARRAYS; CARCINOMAS; PROGNOSIS; CENTROIDS; SURVIVAL; MODELS
DOI
10.1109/TCBB.2009.61
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
This work has two parts. The first introduces new developments in Principal Component Analysis (PCA); the second presents a new method to select variables (genes, in our application). Our focus is on problems where the values taken by each variable do not all have the same importance and where the data may be contaminated with noise and contain outliers, as is the case with microarray data. The usual PCA is not appropriate for this kind of problem. In this context, we propose a new correlation coefficient as an alternative to Pearson's, which leads to a so-called weighted PCA (WPCA). To illustrate the features of our WPCA and compare it with the usual PCA, we analyze gene expression data sets. In the second part, we propose a new PCA-based algorithm that iteratively selects the most important genes in a microarray data set. We show that this algorithm produces better results when our WPCA is used instead of the usual PCA. Furthermore, using Support Vector Machines, we show that it can compete with the Significance Analysis of Microarrays algorithm.
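The idea of swapping the correlation coefficient inside PCA can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not define the new coefficient, so a Spearman-type rank correlation is used here purely as a stand-in. The function names (`rank_columns`, `pearson_corr`, `rank_corr`, `correlation_pca`) and the synthetic data are assumptions made for the example.

```python
import numpy as np


def rank_columns(X):
    """Replace each column by its ordinal ranks (0 .. n-1).

    Ties are broken by position, which is adequate for continuous data.
    """
    return np.argsort(np.argsort(X, axis=0), axis=0).astype(float)


def pearson_corr(X):
    """Usual Pearson correlation matrix between the columns of X."""
    return np.corrcoef(X, rowvar=False)


def rank_corr(X):
    """Spearman-type rank correlation matrix.

    Stand-in only: this is NOT the coefficient proposed in the paper.
    """
    return np.corrcoef(rank_columns(X), rowvar=False)


def correlation_pca(X, corr_fn):
    """PCA via eigendecomposition of a chosen correlation matrix.

    Returns eigenvalues in descending order and the matching
    eigenvectors (principal directions) as columns.
    """
    R = corr_fn(X)
    eigvals, eigvecs = np.linalg.eigh(R)  # R is symmetric
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]


# Synthetic data: 50 samples, 4 variables, columns 0 and 1 strongly related.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=50)
X[0, 1] = 100.0  # a single gross outlier in column 1

vals_pearson, vecs_pearson = correlation_pca(X, pearson_corr)
vals_rank, vecs_rank = correlation_pca(X, rank_corr)

print("Pearson-based eigenvalues:", np.round(vals_pearson, 3))
print("Rank-based eigenvalues:   ", np.round(vals_rank, 3))
```

Only `corr_fn` changes between the two runs, so any difference in the eigenvalues comes from how each coefficient reacts to the outlier. A weighted coefficient such as the one the abstract refers to would be plugged in the same way.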
Pages: 246-252
Page count: 7