A protein identification algorithm for tandem mass spectrometry by incorporating the abundance of mRNA into a binomial probability scoring model

Cited by: 2
Authors
Ma, Wen-Tai [1 ]
Liu, Zhao-Yu [1 ]
Chen, Xiao-Zhou [2 ]
Lin, Zhen-Liang [3 ]
Zheng, Zhong-Bing [1 ]
Miao, Wei-Guo [1 ]
Xie, Shang-Qian [1 ]
Affiliations
[1] Hainan Univ, Inst Trop Agr & Forestry, Haikou 570228, Hainan, Peoples R China
[2] Yunnan Minzu Univ, Sch Math & Comp Sci, Kunming 650031, Yunnan, Peoples R China
[3] Wenzhou Med Univ, Dept Gen Surg, Affiliated Cangnan Hosp, Wenzhou 325800, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Tandem mass spectrometry; RNA-seq; FPKM; Scoring model; Proteomics; SEQ; TRANSCRIPTOME; PROTEOMICS; DISCOVERY; CORRELATE; PEPTIDES;
DOI
10.1016/j.jprot.2019.02.010
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Scoring peptide-spectrum matches (PSMs) between experimental and theoretical spectra is a key step in identifying proteins in mass spectrometry (MS)-based proteomics, and efficient protein identification from MS/MS data remains a challenge. Strategies that use RNA-seq data increase the number of identified proteins by constructing a custom search database and by integrating mRNA abundance into the post-PSM false discovery rate. However, no algorithm has so far incorporated mRNA abundance into the core PSM scoring model itself. We therefore developed a novel PSM scoring model that incorporates mRNA abundance to improve peptide and protein identification. In the new algorithm, mRNA abundance is transformed into a prior probability of protein identification and integrated into PSM re-scoring using a binomial probability distribution model. In comparisons with other algorithms on five MS/MS datasets (human, rat, zebrafish, yeast, and Arabidopsis thaliana), the minimum improvement ratios ranged from 3.39% to 9.79% for peptides and from 0.48% to 8.16% for protein groups. The new strategy offers an effective solution for MS-based identification of peptides and proteins. Significance: The new algorithm identifies proteins by quantifying mRNA abundance (FPKM) and incorporating it into a scoring model for peptide-spectrum matches. This is important for improving peptide and protein identification from MS/MS datasets in proteomics research.
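The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea it describes: a binomial tail probability for fragment-ion matches, weighted by a prior derived from mRNA abundance. Everything specific here is an assumption for illustration and not the authors' method: the function names, the log-scaled FPKM-to-prior mapping, the `floor` value, and the way the prior is combined with the binomial probability.

```python
import math


def binomial_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more random matches."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


def fpkm_to_prior(fpkm, max_fpkm, floor=0.01):
    """Hypothetical mapping of FPKM to a prior in [floor, 1].

    Assumption: log-scaled abundance normalized by the largest FPKM in the
    sample; the paper may use a different transformation.
    """
    if max_fpkm <= 0:
        raise ValueError("max_fpkm must be positive")
    scaled = math.log1p(fpkm) / math.log1p(max_fpkm)
    return max(floor, min(1.0, scaled))


def rescore_psm(n_theoretical, n_matched, p_random, fpkm, max_fpkm):
    """Illustrative re-score: higher is better.

    n_theoretical: number of theoretical fragment ions for the peptide
    n_matched:     number of those ions matched in the experimental spectrum
    p_random:      assumed probability that a single ion matches by chance
    """
    p_chance = binomial_tail(n_theoretical, n_matched, p_random)
    prior = fpkm_to_prior(fpkm, max_fpkm)
    # Assumption: a low prior inflates the chance probability, which lowers
    # the score of peptides from weakly expressed transcripts.
    adjusted = min(1.0, p_chance / prior)
    return -10 * math.log10(adjusted)


if __name__ == "__main__":
    # Same spectral evidence, different transcript abundance.
    print(rescore_psm(20, 8, 0.1, fpkm=50.0, max_fpkm=1000.0))
    print(rescore_psm(20, 8, 0.1, fpkm=0.5, max_fpkm=1000.0))
```

In this sketch, two candidate peptides with identical fragment-ion evidence receive different scores when their parent transcripts differ in abundance, which is the qualitative behavior the abstract describes. The `floor` keeps proteins with no detected transcript from being excluded outright.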
Pages: 53-59
Number of pages: 7
Related papers
50 in total
  • [1] Binomial Probability Distribution Model-Based Protein Identification Algorithm for Tandem Mass Spectrometry Utilizing Peak Intensity Information
    Xiao, Chuan-Le
    Chen, Xiao-Zhou
    Du, Yang-Li
    Sun, Xuesong
    Zhang, Gong
    He, Qing-Yu
    JOURNAL OF PROTEOME RESEARCH, 2013, 12 (01) : 328 - 335
  • [2] A mass accuracy sensitive probability based scoring algorithm for database searching of tandem mass spectrometry data
    Hua Xu
    Michael A Freitas
    BMC Bioinformatics, 8
  • [3] A mass accuracy sensitive probability based scoring algorithm for database searching of tandem mass spectrometry data
    Xu, Hua
    Freitas, Michael A.
    BMC BIOINFORMATICS, 2007, 8
  • [4] SQID: An Intensity-Incorporated Protein Identification Algorithm for Tandem Mass Spectrometry
    Li, Wenzhou
    Ji, Li
    Goya, Jonathan
    Tan, Guanhong
    Wysocki, Vicki H.
    JOURNAL OF PROTEOME RESEARCH, 2011, 10 (04) : 1593 - 1602
  • [5] Probabilistic Consensus Scoring Improves Tandem Mass Spectrometry Peptide Identification
    Nahnsen, Sven
    Bertsch, Andreas
    Rahnenfuehrer, Joerg
    Nordheim, Alfred
    Kohlbacher, Oliver
    JOURNAL OF PROTEOME RESEARCH, 2011, 10 (08) : 3332 - 3343
  • [6] Isotope abundance analysis for improved sample identification with tandem mass spectrometry
    Alon, Tal
    Amirav, Aviv
    RAPID COMMUNICATIONS IN MASS SPECTROMETRY, 2009, 23 (23) : 3668 - 3672
  • [7] Tandem Mass Spectrometry Protein Identification on a PC Grid
    Zosso, D.
    Podvinec, M.
    Mueller, M.
    Aebersold, R.
    Peitsch, M. C.
    Schwede, T.
    FROM GENES TO PERSONALIZED HEALTHCARE: GRID SOLUTIONS FOR THE LIFE SCIENCES, 2007, 126 : 3 - +
  • [8] Towards the Prediction of Protein Abundance from Tandem Mass Spectrometry Data
    Bonner, Anthony J.
    Liu, Han
    PROCEEDINGS OF THE SIXTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2006, : 599 - 603
  • [9] SPICI, a novel scoring function for peptide identification via tandem mass spectrometry
    Chen, L.
    Rajagopal, G.
    MOLECULAR & CELLULAR PROTEOMICS, 2006, 5 (10) : S52 - S52
  • [10] A novel scoring schema for peptide identification by searching protein sequence databases using tandem mass spectrometry data
    Zhuo Zhang
    Shiwei Sun
    Xiaopeng Zhu
    Suhua Chang
    Xiaofei Liu
    Chungong Yu
    Dongbo Bu
    Runsheng Chen
    BMC Bioinformatics, 7