Fast and Accurate Protein False Discovery Rates on Large-Scale Proteomics Data Sets with Percolator 3.0

Cited by: 259

Authors:

The, Matthew [1]

MacCoss, Michael J. [2]

Noble, William S. [2,3]

Kall, Lukas [1]

Affiliations:

[1] KTH Royal Inst Technol, Sci Life Lab, Sch Biotechnol, Box 1031, S-17121 Solna, Sweden

[2] Univ Washington, Sch Med, Dept Genome Sci, Seattle, WA 98195 USA

[3] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA

Source:

JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY | 2016, Vol. 27, No. 11

Funding:

U.S. National Institutes of Health;

Keywords:

Mass spectrometry - LC-MS/MS; Statistical analysis; Data processing and analysis; Protein inference; Large scale studies; TANDEM MASS-SPECTROMETRY; SHOTGUN PROTEOMICS; PEPTIDE IDENTIFICATION; SPECTRA; PROBABILITIES; DATABASES; INFERENCE; STRIKE;

DOI:

10.1007/s13361-016-1460-7

Chinese Library Classification:

Q5 [Biochemistry];

Discipline classification codes:

071010 ; 081704 ;

Abstract:

Percolator is a widely used software tool that increases yield in shotgun proteomics experiments and assigns reliable statistical confidence measures, such as q values and posterior error probabilities, to peptides and peptide-spectrum matches (PSMs) from such experiments. Percolator's processing speed has been sufficient for typical data sets consisting of hundreds of thousands of PSMs. With our new scalable approach, we can now also analyze millions of PSMs in a matter of minutes on a commodity computer. Furthermore, with the increasing awareness of the need for reliable statistics on the protein level, we compared several easy-to-understand protein inference methods and implemented the best-performing method (grouping proteins by their corresponding sets of theoretical peptides and then considering only the best-scoring peptide for each protein) in the Percolator package. We used Percolator 3.0 to analyze the data from a recent study of the draft human proteome containing 25 million spectra (PM:24870542). The source code and Ubuntu, Windows, MacOS, and Fedora binary packages are available from http://percolator.ms/ under an Apache 2.0 license.

Pages: 1719-1727

Page count: 9

50 items in total

[31] How to talk about protein-level false discovery rates in shotgun proteomics
The, Matthew
Tasnim, Ayesha
Kall, Lukas
PROTEOMICS, 2016, 16 (18) : 2461 - 2469
[32] Weighted False Discovery Rate Control in Large-Scale Multiple Testing
Basu, Pallavi
Cai, T. Tony
Das, Kiranmoy
Sun, Wenguang
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2018, 113 (523) : 1172 - 1183
[33] The False Discovery Rate: A Key Concept in Large-Scale Genetic Studies
Chen, James J.
Roberson, Paula K.
Schell, Michael J.
CANCER CONTROL, 2010, 17 (01) : 58 - 62
[34] Fast Plagiarism Detection in Large-Scale Data
Szmit, Radoslaw
BEYOND DATABASES, ARCHITECTURES AND STRUCTURES: TOWARDS EFFICIENT SOLUTIONS FOR DATA ANALYSIS AND KNOWLEDGE REPRESENTATION, 2017, 716 : 329 - 343
[35] Fast Unsupervised Projection for Large-Scale Data
Wang, Jingyu
Wang, Lin
Nie, Feiping
Li, Xuelong
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (08) : 3634 - 3644
[36] iSwift: Fast and Accurate Impact Identification for Large-scale CDNs
Sun, Jiyan
Lin, Tao
Liu, Yinlong
Wang, Xin
Jiang, Bo
Geng, Liru
Jing, Pengkun
Dai, Liang
2022 IEEE/ACM 30TH INTERNATIONAL SYMPOSIUM ON QUALITY OF SERVICE (IWQOS), 2022,
[37] QuartetS: a fast and accurate algorithm for large-scale orthology detection
Yu, Chenggang
Zavaljevski, Nela
Desai, Valmik
Reifman, Jaques
NUCLEIC ACIDS RESEARCH, 2011, 39 (13) : e88
[38] Big data and false discovery: analyses of bibliometric indicators from large data sets
Prathap, Gangan
SCIENTOMETRICS, 2014, 98 (02) : 1421 - 1422
[40] Feature selection for large-scale data sets in GrC
Liang, Jiye
2012 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING (GRC 2012), 2012, : 2 - 7
