Treatment of missing values for multivariate statistical analysis of gel-based proteomics data

Cited by: 46

Authors:

Pedreschi, Romina [1]

Hertog, Maarten L. A. T. M. [1]

Carpentier, Sebastien C. [2]

Lammertyn, Jeroen [1]

Robben, Johan [3,4]

Noben, Jean-Paul [3,4]

Panis, Bart [2]

Swennen, Rony [2]

Nicolai, Bart M. [1]

Affiliations:

[1] Katholieke Univ Leuven, BIOSYST MeBioS Div, B-3001 Heverlee, Belgium

[2] Katholieke Univ Leuven, Div Crop Biotech, Louvain, Belgium

[3] Transnatl Univ Limburg, Hasselt Univ, Biomed Res Inst, Diepenbeek, Belgium

[4] Transnatl Univ Limburg, Sch Life Sci, Biomed Res Inst, Diepenbeek, Belgium

Source:

PROTEOMICS | 2008 / Vol. 8 / Issue 07

Keywords:

DIGE; missing value; postrun staining; preprocessing; statistics

DOI:

10.1002/pmic.200700975

Chinese Library Classification (CLC) number:

Q5 [Biochemistry]

Subject classification codes:

071010; 081704

Abstract:

The presence of missing values in gel-based proteomics data represents a real challenge if an objective statistical analysis is pursued. Different methods for handling missing values were evaluated, and their influence on the selection of important proteins through multivariate techniques is discussed. The evaluated methods either dealt with missing values directly during the multivariate analysis, using the nonlinear estimation by iterative partial least squares (NIPALS) algorithm, or imputed them beforehand using k-nearest neighbor or Bayesian principal component analysis (BPCA). These techniques were applied to data obtained from gels stained with classical postrunning dyes and from DIGE gels. Before applying the multivariate techniques, the normality and homoscedasticity assumptions on which parametric tests are based were tested to ensure a sound statistical analysis. Of the three methods tested for handling missing values in our datasets, BPCA imputation proved to be the most consistent.
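To make the k-nearest neighbor option mentioned in the abstract concrete, the following is a minimal sketch of kNN imputation on a spot-by-gel matrix. It is not the authors' implementation: the function name knn_impute, the layout (rows are protein spots, columns are gels), the root-mean-square distance over jointly observed columns, and the unweighted averaging of donor values are all assumptions chosen for illustration.

```python
import numpy as np

def knn_impute(X, k=3):
    """Fill NaNs with the mean of the k nearest rows (hypothetical helper).

    Assumed layout: rows = protein spots, columns = gels.
    Distance = root-mean-square difference over columns observed in both rows.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    missing = np.isnan(X)

    for i in np.where(missing.any(axis=1))[0]:
        # Rank every other row by its distance to row i.
        dists = []
        for j in range(X.shape[0]):
            if j == i:
                continue
            shared = ~missing[i] & ~missing[j]
            if not shared.any():
                continue  # no common observations, distance undefined
            d = np.sqrt(np.mean((X[i, shared] - X[j, shared]) ** 2))
            dists.append((d, j))
        dists.sort()

        for col in np.where(missing[i])[0]:
            # Take the k closest rows that actually have a value in this column.
            donors = [X[j, col] for _, j in dists if not missing[j, col]][:k]
            if donors:
                filled[i, col] = np.mean(donors)
    return filled

# Toy example: the second spot lacks a value on the third gel.
X = np.array([
    [1.0, 2.0, 3.0],
    [1.1, 2.1, np.nan],
    [5.0, 6.0, 7.0],
    [1.0, 2.0, 3.2],
])
imputed = knn_impute(X, k=2)
```

In this toy matrix the two spots with profiles closest to the incomplete one supply the imputed value, while the distant third spot is ignored. Observed entries are left unchanged; an entry stays missing only if no other row has a value in that column.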


Pages: 1371-1383

Number of pages: 13

