Evaluating Proteomics Imputation Methods with Improved Criteria

Cited by: 4
Authors
Harris, Lincoln [1]
Fondrie, William E. [2]
Oh, Sewoong [3]
Noble, William S. [1,3]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] Talus Biosci, Seattle, WA 98112 USA
[3] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
quantitative mass spectrometry; proteomics; imputation; machine learning; statistics; differential expression; lower limit of quantification; missing value imputation; mass spectrometry; R package; sets
DOI
10.1021/acs.jproteome.3c00205
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Quantitative measurements produced by tandem mass spectrometry proteomics experiments typically contain a large proportion of missing values. Missing values hinder reproducibility, reduce statistical power, and make it difficult to compare across samples or experiments. Although many methods exist for imputing missing values, in practice, the most commonly used methods are among the worst performing. Furthermore, previous benchmarking studies have focused on relatively simple measurements of error such as the mean-squared error between imputed and held-out values. Here we evaluate the performance of commonly used imputation methods using three practical, "downstream-centric" criteria. These criteria measure the ability to identify differentially expressed peptides, generate new quantitative peptides, and improve the peptide lower limit of quantification. Our evaluation comprises several experiment types and acquisition strategies, including data-dependent and data-independent acquisition. We find that imputation does not necessarily improve the ability to identify differentially expressed peptides but that it can identify new quantitative peptides and improve the peptide lower limit of quantification. We find that MissForest is generally the best performing method per our downstream-centric criteria. We also argue that existing imputation methods do not properly account for the variance of peptide quantifications and highlight the need for methods that do.
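The abstract mentions the conventional benchmark that earlier studies relied on: the mean-squared error between imputed and held-out values. The sketch below illustrates that idea only and is not the authors' code. The function names, the synthetic peptides-by-samples matrix, the 20% hold-out fraction, and the row-mean baseline imputer are all assumptions chosen for illustration.

```python
import numpy as np

def mask_held_out(X, frac, rng):
    """Hide a random fraction of the observed entries (hypothetical helper).

    Returns the masked matrix and a boolean mask marking the hidden entries.
    """
    observed_idx = np.flatnonzero(~np.isnan(X))
    n_hide = int(round(frac * observed_idx.size))
    hidden = rng.choice(observed_idx, size=n_hide, replace=False)
    flat_mask = np.zeros(X.size, dtype=bool)
    flat_mask[hidden] = True
    mask = flat_mask.reshape(X.shape)
    X_masked = X.copy()
    X_masked[mask] = np.nan
    return X_masked, mask

def impute_row_mean(X):
    """Baseline imputer (an assumption, not a method from the paper):
    fill each missing entry with the mean of its row (peptide),
    falling back to the global mean when a row has no observed values."""
    observed = ~np.isnan(X)
    sums = np.where(observed, X, 0.0).sum(axis=1, keepdims=True)
    counts = observed.sum(axis=1, keepdims=True)
    global_mean = sums.sum() / counts.sum()
    row_means = np.where(counts > 0, sums / np.maximum(counts, 1), global_mean)
    return np.where(observed, X, row_means)

def held_out_mse(X_true, X_imputed, mask):
    """Mean-squared error computed over the held-out entries only."""
    diff = X_true[mask] - X_imputed[mask]
    return float(np.mean(diff ** 2))

# Synthetic log-intensity matrix: 50 peptides x 6 samples (illustrative data).
rng = np.random.default_rng(0)
X = rng.normal(loc=20.0, scale=2.0, size=(50, 6))

X_masked, mask = mask_held_out(X, frac=0.2, rng=rng)
X_imputed = impute_row_mean(X_masked)
mse = held_out_mse(X, X_imputed, mask)
print(f"held-out MSE: {mse:.3f}")
```

A score like this captures reconstruction error only. The abstract's argument is that such a measure says little about downstream outcomes such as differential expression calls or the lower limit of quantification.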
Pages: 3427-3438
Page count: 12