PhoStar: Identifying Tandem Mass Spectra of Phosphorylated Peptides before Database Search

Cited by: 8
Authors
Dorl, Sebastian [1 ]
Winkler, Stephan [1 ]
Mechtler, Karl [2 ,3 ]
Dorfer, Viktoria [1 ]
Affiliations
[1] Univ Appl Sci Upper Austria, Bioinformat Res Grp, Softwarepk 11, A-4232 Hagenberg, Austria
[2] Res Inst Mol Pathol IMP, Prot Chem, Campus Vienna Bioctr 1, A-1030 Vienna, Austria
[3] Inst Mol Biotechnol IMBA, Prot Chem, Chem Vienna Bioctr VBC, Dr Bohr Gasse 3, A-1030 Vienna, Austria
Funding
Austrian Science Fund (FWF);
Keywords
mass spectrometry; proteomics; post-translational modification; phosphorylation; search space reduction; machine learning; random forest classification; POSTTRANSLATIONAL MODIFICATIONS; PROTEIN IDENTIFICATION; SPECTROMETRY; ALGORITHM; TOOL; PREDICTION;
DOI
10.1021/acs.jproteome.7b00563
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Standard proteomics workflows use tandem mass spectrometry followed by sequence database search to analyze complex biological samples. Proteins carrying post-translational modifications such as phosphorylation are typically identified by allowing variable modifications in the searched sequences. Accounting for these variations exponentially increases the combinatorial search space, which lengthens processing times and yields more false-positive identifications. PhoStar, the tool presented here, uses a supervised machine learning approach to identify spectra that originate from phosphorylated peptides before database search. The phosphorylation prediction model was trained and validated with an accuracy of 97.6% on a large set of high-confidence spectra collected from publicly available experimental data. Its performance was further validated by predicting phosphorylation in the complete NIST human and mouse higher-energy collisional dissociation (HCD) spectral libraries, achieving accuracies of 98.2% and 97.9%, respectively. We demonstrate the application of PhoStar by using it to filter spectra before database search. In database searches of HeLa samples, the peptide search space was reduced by 27-66% while still finding at least 97% of the total peptide identifications (at 1% FDR) obtained with a standard workflow.
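The growth of the search space described in the abstract can be illustrated with a small sketch. This is not PhoStar code and not taken from the paper; it only counts how many candidate forms a single peptide yields when phosphorylation is allowed as a variable modification. The function names, the example peptide, the restriction to S/T/Y residues, and the use of lowercase letters to mark a phosphorylated residue are all assumptions made for illustration.

```python
from itertools import combinations
from math import comb

# Assumption: only serine, threonine and tyrosine are treated as phosphosites.
PHOSPHO_RESIDUES = set("STY")


def count_candidates(peptide: str, max_mods: int) -> int:
    """Count peptide forms (including the unmodified one) when up to
    max_mods phosphorylations are allowed as variable modifications."""
    sites = sum(1 for aa in peptide if aa in PHOSPHO_RESIDUES)
    return sum(comb(sites, k) for k in range(min(sites, max_mods) + 1))


def enumerate_candidates(peptide: str, max_mods: int) -> list[str]:
    """List every candidate form; a lowercase letter marks a phosphorylated
    residue (a hypothetical notation used only in this sketch)."""
    sites = [i for i, aa in enumerate(peptide) if aa in PHOSPHO_RESIDUES]
    forms = []
    for k in range(min(len(sites), max_mods) + 1):
        for chosen in combinations(sites, k):
            forms.append(
                "".join(aa.lower() if i in chosen else aa
                        for i, aa in enumerate(peptide))
            )
    return forms


if __name__ == "__main__":
    peptide = "SSTPYSK"  # hypothetical peptide with five S/T/Y residues
    for limit in (0, 1, 3, 5):
        print(limit, count_candidates(peptide, limit))
```

With no cap on the number of modifications, a peptide with n candidate sites has 2^n forms, which is the exponential growth the abstract refers to. A spectrum that is predicted to be unphosphorylated before the search needs to be compared only against the single unmodified form.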
Pages: 290-295
Page count: 6