Evaluating Proteomics Imputation Methods with Improved Criteria

Cited by: 4
Authors
Harris, Lincoln [1]
Fondrie, William E. [2]
Oh, Sewoong [3]
Noble, William S. [1, 3]
Affiliations
[1] Univ Washington, Dept Genome Sci, Seattle, WA 98195 USA
[2] Talus Biosci, Seattle, WA 98112 USA
[3] Univ Washington, Paul G Allen Sch Comp Sci & Engn, Seattle, WA 98195 USA
Keywords
quantitative mass spectrometry; proteomics; imputation; machine learning; statistics; differential expression; lower limit of quantification; MISSING VALUE IMPUTATION; MASS SPECTROMETRY; R-PACKAGE; SETS
DOI
10.1021/acs.jproteome.3c00205
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Quantitative measurements produced by tandem mass spectrometry proteomics experiments typically contain a large proportion of missing values. Missing values hinder reproducibility, reduce statistical power, and make it difficult to compare across samples or experiments. Although many methods exist for imputing missing values, in practice, the most commonly used methods are among the worst performing. Furthermore, previous benchmarking studies have focused on relatively simple measurements of error such as the mean-squared error between imputed and held-out values. Here we evaluate the performance of commonly used imputation methods using three practical, "downstream-centric" criteria. These criteria measure the ability to identify differentially expressed peptides, generate new quantitative peptides, and improve the peptide lower limit of quantification. Our evaluation comprises several experiment types and acquisition strategies, including data-dependent and data-independent acquisition. We find that imputation does not necessarily improve the ability to identify differentially expressed peptides but that it can identify new quantitative peptides and improve the peptide lower limit of quantification. We find that MissForest is generally the best performing method per our downstream-centric criteria. We also argue that existing imputation methods do not properly account for the variance of peptide quantifications and highlight the need for methods that do.
Pages: 3427-3438
Page count: 12