A Weighted Principal Component Analysis and Its Application to Gene Expression Data

Cited by: 28
Authors
da Costa, Joaquim F. Pinto [1, 2]
Alonso, Hugo [3, 4, 5]
Roque, Luis [6]
Affiliations
[1] Univ Porto, Fac Ciencias, Dept Matemat, P-4169007 Oporto, Portugal
[2] Univ Porto CMUP, Ctr Matemat, Oporto, Portugal
[3] Univ Lusofona Porto, Fac Econ & Gestao, P-4000098 Oporto, Portugal
[4] Univ Aveiro, Dept Matemat, P-3810193 Aveiro, Portugal
[5] Univ Aveiro, CIDMA, Aveiro, Portugal
[6] Inst Super Engn Porto, Grp Invest Engn Conhecimento & Apoio Decisao GECA, P-4200072 Oporto, Portugal
Keywords
Correlation; principal component analysis; support vector machines; microarray data; gene selection; LYMPH-NODE METASTASIS; RANK MEASURE; CANCER; CLASSIFICATION; MICROARRAYS; CARCINOMAS; PROGNOSIS; CENTROIDS; SURVIVAL; MODELS
DOI
10.1109/TCBB.2009.61
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
In the first part of this work, we introduce new developments in Principal Component Analysis (PCA); in the second part, we introduce a new method to select variables (genes in our application). Our focus is on problems where the values taken by each variable do not all have the same importance and where the data may be contaminated with noise and contain outliers, as is the case with microarray data. The usual PCA is not appropriate for this kind of problem. In this context, we propose the use of a new correlation coefficient as an alternative to Pearson's. This leads to a so-called weighted PCA (WPCA). To illustrate the features of our WPCA and compare it with the usual PCA, we consider the problem of analyzing gene expression data sets. In the second part of this work, we propose a new PCA-based algorithm to iteratively select the most important genes in a microarray data set. We show that this algorithm produces better results when our WPCA is used instead of the usual PCA. Furthermore, by using Support Vector Machines, we show that it can compete with the Significance Analysis of Microarrays algorithm.
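The abstract's central idea, running PCA on a correlation matrix built from something other than Pearson's coefficient, can be sketched roughly as follows. This is only an illustration under stated assumptions: the paper's new weighted coefficient is not defined in this record, so a plain rank-based (Spearman-style) correlation is swapped in as a stand-in, and the function names (`pearson_corr`, `rank_corr`, `pca_from_corr`) and the toy data are hypothetical.

```python
import numpy as np

def pearson_corr(X):
    # Ordinary Pearson correlation between the columns (variables) of X.
    return np.corrcoef(X, rowvar=False)

def rank_corr(X):
    # Stand-in for an alternative coefficient (NOT the paper's weighted one):
    # Pearson correlation computed on per-column ranks (Spearman-style).
    # Assumes no ties, which holds for continuous toy data.
    ranks = np.argsort(np.argsort(X, axis=0), axis=0) + 1.0
    return np.corrcoef(ranks, rowvar=False)

def pca_from_corr(X, corr_fn):
    # PCA via eigen-decomposition of the chosen correlation matrix.
    R = corr_fn(X)
    eigvals, eigvecs = np.linalg.eigh(R)   # eigh: R is symmetric
    order = np.argsort(eigvals)[::-1]      # sort by decreasing variance
    return eigvals[order], eigvecs[:, order]

# Hypothetical toy data: 30 samples, 4 variables, one injected outlier.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
X[0, 0] = 50.0

vals_p, vecs_p = pca_from_corr(X, pearson_corr)
vals_r, vecs_r = pca_from_corr(X, rank_corr)

# Proportion of variance explained by each component under each coefficient.
print("Pearson-based PCA:", vals_p / vals_p.sum())
print("Rank-based PCA:   ", vals_r / vals_r.sum())
```

Passing the correlation function as an argument keeps the PCA step unchanged, so a different coefficient (such as the weighted one the paper proposes) could be dropped in without touching the decomposition code.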
Pages: 246-252
Number of pages: 7