Integrated approach for manual evaluation of peptides identified by searching protein sequence databases with tandem mass spectra

Cited by: 161
Authors
Chen, Y [1]
Kwon, SW [1]
Kim, SC [1]
Zhao, YM [1]
Affiliation
[1] Univ Texas, SW Med Ctr, Dept Biochem, Dallas, TX 75390 USA
Keywords
protein identification; manual evaluation; automated database search
DOI
10.1021/pr049754t
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Quantitative proteomics relies on accurate protein identification, which is often carried out by automated searching of a sequence database with tandem mass spectra of peptides. When these spectra contain limited information, automated searches may lead to incorrect peptide identifications. It is therefore necessary to validate the identifications by careful manual inspection of the mass spectra. Not only is this task time-consuming, but the reliability of the validation varies with the experience of the analyst. Here, we report a systematic approach to evaluating peptide identifications made by automated search algorithms. The method is based on the principle that the candidate peptide sequence should adequately explain the observed fragment ions. Also, the mass errors of neighboring fragments should be similar. To evaluate our method, we studied tandem mass spectra obtained from tryptic digests of E. coli and HeLa cells. Candidate peptides were identified with the automated search engine Mascot and subjected to the manual validation method. The method found correct peptide identifications that were given low Mascot scores (e.g., 20-25) and incorrect peptide identifications that were given high Mascot scores (e.g., 40-50). The method comprehensively detected false results from searches designed to produce incorrect identifications. Comparison of the tandem mass spectra of synthetic candidate peptides to the spectra obtained from the complex peptide mixtures confirmed the accuracy of the evaluation method. Thus, the evaluation approach described here could help boost the accuracy of protein identification, increase the number of peptides identified, and provide a step toward developing a more accurate next-generation algorithm for protein identification.
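The two principles stated in the abstract (the candidate sequence should explain the observed fragment ions, and neighboring fragments should show similar mass errors) can be illustrated with a minimal Python sketch. This is not the authors' procedure: the function names, the restriction to singly charged b- and y-ions, and the thresholds `tol` and `max_error_jump` are all assumptions chosen for illustration.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_ions(peptide):
    """Return singly charged b- and y-ion m/z values as {label: m/z}."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = {}
    for i in range(1, len(peptide)):
        ions[f"b{i}"] = sum(masses[:i]) + PROTON
        ions[f"y{i}"] = sum(masses[-i:]) + WATER + PROTON
    return ions


def match_peaks(peptide, observed_mz, tol=0.5):
    """Match each theoretical ion to the nearest observed peak.

    Returns {label: observed - theoretical} for ions matched within tol.
    The tolerance is a hypothetical value, not one taken from the paper.
    """
    errors = {}
    if not observed_mz:
        return errors
    for label, mz in fragment_ions(peptide).items():
        nearest = min(observed_mz, key=lambda x: abs(x - mz))
        if abs(nearest - mz) <= tol:
            errors[label] = nearest - mz
    return errors


def evaluate(peptide, observed_mz, tol=0.5, max_error_jump=0.2):
    """Return (coverage, consistent) for a candidate peptide.

    coverage:   fraction of theoretical b/y ions matched to a peak.
    consistent: True if matched neighbors in the same ion series
                (e.g. b3 and b4) differ in mass error by at most
                max_error_jump (a hypothetical threshold).
    """
    errors = match_peaks(peptide, observed_mz, tol)
    coverage = len(errors) / (2 * (len(peptide) - 1))
    jumps = []
    for series in "by":
        for i in range(1, len(peptide) - 1):
            a, b = f"{series}{i}", f"{series}{i + 1}"
            if a in errors and b in errors:
                jumps.append(abs(errors[a] - errors[b]))
    consistent = all(j <= max_error_jump for j in jumps)
    return coverage, consistent
```

In this sketch, a candidate with high coverage but erratic mass errors between adjacent fragments would be flagged for closer inspection, since a systematic calibration offset should shift neighboring ions by roughly the same amount. Unexplained observed peaks, neutral losses, and multiply charged fragments are deliberately left out here.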
Pages: 998-1005
Number of pages: 8