Fast and Accurate Protein False Discovery Rates on Large-Scale Proteomics Data Sets with Percolator 3.0

Cited by: 259
Authors
The, Matthew [1]
MacCoss, Michael J. [2]
Noble, William S. [2,3]
Kall, Lukas [1]
Affiliations
[1] KTH Royal Inst Technol, Sci Life Lab, Sch Biotechnol, Box 1031, S-17121 Solna, Sweden
[2] Univ Washington, Sch Med, Dept Genome Sci, Seattle, WA 98195 USA
[3] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
Funding
National Institutes of Health (NIH), USA
Keywords
Mass spectrometry - LC-MS/MS; Statistical analysis; Data processing and analysis; Protein inference; Large scale studies; Tandem mass-spectrometry; Shotgun proteomics; Peptide identification; Spectra; Probabilities; Databases; Inference; Strike
DOI
10.1007/s13361-016-1460-7
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Percolator is a widely used software tool that increases yield in shotgun proteomics experiments and assigns reliable statistical confidence measures, such as q values and posterior error probabilities, to peptides and peptide-spectrum matches (PSMs) from such experiments. Percolator's processing speed has been sufficient for typical data sets consisting of hundreds of thousands of PSMs. With our new scalable approach, we can now also analyze millions of PSMs in a matter of minutes on a commodity computer. Furthermore, with the increasing awareness of the need for reliable statistics on the protein level, we compared several easy-to-understand protein inference methods and implemented the best-performing method (grouping proteins by their corresponding sets of theoretical peptides and then considering only the best-scoring peptide for each protein) in the Percolator package. We used Percolator 3.0 to analyze the data from a recent study of the draft human proteome containing 25 million spectra (PMID: 24870542). The source code and Ubuntu, Windows, MacOS, and Fedora binary packages are available from http://percolator.ms/ under an Apache 2.0 license.
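The protein inference idea described in the abstract (group proteins that share the same set of theoretical peptides, then score each group by its best-scoring peptide) can be illustrated with a minimal Python sketch. This is not Percolator's implementation: the function names, the toy scores, and the `decoy_` prefix convention are hypothetical, and the q values use a plain decoys-over-targets ratio as an assumed, simplified estimator.

```python
from collections import defaultdict


def best_peptide_protein_scores(peptide_scores, protein_to_peptides):
    """Group proteins with identical theoretical peptide sets and score
    each group by its best-scoring observed peptide (hypothetical helper)."""
    groups = defaultdict(list)
    for protein, peptides in protein_to_peptides.items():
        groups[frozenset(peptides)].append(protein)

    scored = []
    for peptides, proteins in groups.items():
        observed = [peptide_scores[p] for p in peptides if p in peptide_scores]
        if observed:  # skip groups with no identified peptide
            scored.append((tuple(sorted(proteins)), max(observed)))
    # Rank groups from best to worst score.
    return sorted(scored, key=lambda item: item[1], reverse=True)


def target_decoy_q_values(scored_groups, is_decoy):
    """Estimate q values from a ranked list with a simple decoy/target
    ratio (an assumed estimator), then enforce monotonicity."""
    fdrs = []
    targets = decoys = 0
    for proteins, _score in scored_groups:
        if all(is_decoy(p) for p in proteins):
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))

    # q value = minimum FDR at this rank or any lower-scoring rank.
    q_values = []
    running_min = 1.0
    for fdr in reversed(fdrs):
        running_min = min(running_min, fdr)
        q_values.append(running_min)
    q_values.reverse()
    return q_values


# Toy input: higher peptide score means a more confident identification.
peptide_scores = {"PEPA": 3.2, "PEPB": 1.1, "PEPC": 2.5, "DECX": 0.9}
protein_to_peptides = {
    "P1": ["PEPA", "PEPB"],
    "P2": ["PEPA", "PEPB"],   # same peptide set as P1, so one group
    "P3": ["PEPC"],
    "decoy_P4": ["DECX"],     # "decoy_" prefix is an assumed convention
    "P5": ["PEPZ"],           # no observed peptide, so it is dropped
}

ranked = best_peptide_protein_scores(peptide_scores, protein_to_peptides)
q_values = target_decoy_q_values(ranked, lambda name: name.startswith("decoy_"))
for (proteins, score), q in zip(ranked, q_values):
    print(proteins, score, q)
```

Scoring a group by a single best peptide keeps the protein score easy to interpret, which matches the abstract's emphasis on easy-to-understand inference methods.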
Pages: 1719-1727
Page count: 9