Treatment of missing values for multivariate statistical analysis of gel-based proteomics data

Cited by: 46
Authors
Pedreschi, Romina [1]
Hertog, Maarten L. A. T. M. [1]
Carpentier, Sebastien C. [2]
Lammertyn, Jeroen [1]
Robben, Johan [3, 4]
Noben, Jean-Paul [3, 4]
Panis, Bart [2]
Swennen, Rony [2]
Nicolai, Bart M. [1]
Affiliations
[1] Katholieke Univ Leuven, BIOSYST MeBioS Div, B-3001 Heverlee, Belgium
[2] Katholieke Univ Leuven, Div Crop Biotech, Louvain, Belgium
[3] Transnatl Univ Limburg, Hasselt Univ, Biomed Res Inst, Diepenbeek, Belgium
[4] Transnatl Univ Limburg, Sch Life Sci, Biomed Res Inst, Diepenbeek, Belgium
Keywords
DIGE; missing value; postrun staining; preprocessing; statistics
DOI
10.1002/pmic.200700975
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The presence of missing values in gel-based proteomics data represents a real challenge if an objective statistical analysis is pursued. Different methods for handling missing values were evaluated, and their influence on the selection of important proteins through multivariate techniques is discussed. The evaluated methods either handled the missing values directly during the multivariate analysis, using the nonlinear estimation by iterative partial least squares (NIPALS) algorithm, or imputed them beforehand using k-nearest neighbor or Bayesian principal component analysis (BPCA). These techniques were applied to data obtained from gels stained with classical postrunning dyes and from DIGE gels. Before the multivariate techniques were applied, the normality and homoscedasticity assumptions on which parametric tests are based were tested to ensure a sound statistical analysis. Of the three methods tested for handling missing values in the authors' datasets, BPCA imputation proved to be the most consistent.
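As a rough illustration of one of the approaches named in the abstract, k-nearest neighbor imputation can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used in the paper: the function name `knn_impute`, the layout (rows are spots, columns are gels), the distance measure (root mean squared difference over the columns observed in both rows), and the toy values are all hypothetical choices made for illustration.

```python
import numpy as np


def knn_impute(data, k=3):
    """Fill NaN entries of each row from the k most similar rows.

    Assumed layout (hypothetical): rows are spots, columns are gels.
    """
    data = np.asarray(data, dtype=float)
    filled = data.copy()
    missing = np.isnan(data)

    for i in np.where(missing.any(axis=1))[0]:
        candidates = []
        for j in range(data.shape[0]):
            if j == i:
                continue
            # Compare only the columns observed in both rows.
            shared = ~missing[i] & ~missing[j]
            if not shared.any():
                continue
            dist = np.sqrt(np.mean((data[i, shared] - data[j, shared]) ** 2))
            candidates.append((dist, j))
        candidates.sort()

        for col in np.where(missing[i])[0]:
            # Donors: the k nearest rows that have a value in this column.
            donors = [j for _, j in candidates if not missing[j, col]][:k]
            if donors:
                filled[i, col] = data[donors, col].mean()
    return filled


# Toy spot-by-gel matrix with one missing value (illustrative numbers only).
spots = np.array([
    [1.0, 2.0, 3.0],
    [1.1, 2.1, np.nan],
    [0.9, 1.9, 2.9],
    [5.0, 6.0, 7.0],
])
completed = knn_impute(spots, k=2)
```

Here the missing entry in the second row is filled with the average of the corresponding values from the two rows whose observed profiles are closest to it. A row with no usable donors keeps its NaN, so the caller can see what could not be imputed.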
Pages: 1371-1383
Number of pages: 13
Related papers
50 items in total
  • [1] Missing values in gel-based proteomics
    Albrecht, Daniela
    Kniemeyer, Olaf
    Brakhage, Axel A.
    Guthke, Reinhard
    [J]. PROTEOMICS, 2010, 10 (06) : 1202 - 1211
  • [2] The promise of gel-based proteomics
    Smejkal, Gary B.
    [J]. EXPERT OPINION ON DRUG DISCOVERY, 2006, 1 (01) : 7 - 10
  • [3] Data Visualization and Feature Selection Methods in Gel-based Proteomics
    Silva, Tome S.
    Richard, Nadege
    Dias, Jorge P.
    Rodrigues, Pedro M.
    [J]. CURRENT PROTEIN & PEPTIDE SCIENCE, 2014, 15 (01) : 4 - 22
  • [4] Highlights on the capacities of "Gel-based" proteomics
    Chevalier, Francois
    [J]. PROTEOME SCIENCE, 2010, 8
  • [5] Highlights on the capacities of "Gel-based" proteomics
    François Chevalier
    [J]. Proteome Science, 8
  • [6] Gel-based methods in redox proteomics
    Charles, Rebecca
    Jayawardhana, Tamani
    Eaton, Philip
    [J]. BIOCHIMICA ET BIOPHYSICA ACTA-GENERAL SUBJECTS, 2014, 1840 (02) : 830 - 837
  • [7] Robust likelihood-based analysis of multivariate data with missing values
    Little, R
    An, HG
    [J]. STATISTICA SINICA, 2004, 14 (03) : 949 - 968
  • [8] MISSING VALUES IN MULTIVARIATE DATA
    KUZMA, JW
    [J]. BIOMETRICS, 1965, 21 (01) : 254 - &
  • [9] Fluorescence data analysis on gel-based biochips
    Barsky, V
    Perov, A
    Tokalov, S
    Chudinov, A
    Kreindlin, E
    Sharonov, A
    Kotova, E
    Mirzabekov, A
    [J]. JOURNAL OF BIOMOLECULAR SCREENING, 2002, 7 (03) : 247 - 257
  • [10] Dealing with missing values in proteomics data
    Kong, Weijia
    Hui, Harvard Wai Hann
    Peng, Hui
    Bin Goh, Wilson Wen
    [J]. PROTEOMICS, 2022, 22 (23-24)