Dear-PSM: A deep learning-based peptide search engine enables full database search for proteomics

Authors
He, Qingzu [1, 2]
Li, Xiang [1]
Zhong, Jinjin [2, 3]
Yang, Gen [2, 4]
Han, Jiahuai [5]
Shuai, Jianwei [2, 3]
Affiliations
[1] Xiamen Univ, Natl Inst Data Sci Hlth & Med, Dept Phys, Xiamen, Peoples R China
[2] Univ Chinese Acad Sci, Wenzhou Inst, Wenzhou Key Lab Biophys, Wenzhou 325001, Zhejiang, Peoples R China
[3] Oujiang Lab, Zhejiang Lab Regenerat Med Vis & Brain Hlth, Wenzhou 325053, Zhejiang, Peoples R China
[4] Peking Univ, Sch Phys, State Key Lab Nucl Phys & Technol, Beijing, Peoples R China
[5] Xiamen Univ, Innovat Ctr Cell Signaling Network, Sch Life Sci, State Key Lab Cellular Stress Biol, Xiamen 361102, Fujian, Peoples R China
Source
SMART MEDICINE | 2024 / Vol. 3 / Issue 03
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China
Keywords
deep learning; inverted index; mass spectrometry; peptide search; proteomics; SHOTGUN PROTEOMICS; MASS; IDENTIFICATION; TANDEM; SPECTRA; RATES;
DOI
10.1002/SMMD.20240014
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Peptide-spectrum matching is the process of linking mass spectrometry data with peptide sequences. An experimental spectrum can match thousands of candidate peptides, and variable modifications cause the number of candidates to grow exponentially. Completing the search within a limited time is a key challenge. Traditional search engines speed up the process by restricting peptide mass errors and the number of variable modifications, but this limits interpretive capability. To address this challenge, we propose Dear-PSM, a peptide search engine that supports full database searching. Dear-PSM does not restrict peptide mass errors: it matches each spectrum against all peptides in the database and raises the number of variable modifications per peptide from the conventional 3 to 20. Leveraging inverted index technology, Dear-PSM builds a high-performance index table of experimental spectra and uses deep learning algorithms for peptide validation. With these techniques, Dear-PSM runs 7 times faster than mainstream search engines on a regular desktop computer, with a 240-fold reduction in memory consumption. Benchmark results demonstrate that Dear-PSM, in full database search mode, reproduces over 90% of the results obtained by mainstream search engines on complex mass spectrometry data collected from different species using various instruments. Furthermore, it uncovers a substantial number of new peptides and proteins. Dear-PSM has been publicly released on GitHub.
The full database search strategy proposed in this study expands the search scope to include all peptide sequences within the database, with peptide mass tolerances extending to several thousand Daltons. Moreover, Dear-PSM supports up to 20 variable modifications per peptide sequence and considers all possible combinations of these modifications, significantly expanding the peptide search space.
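The idea of indexing experimental spectra rather than peptides can be illustrated with a minimal sketch. It is not the Dear-PSM implementation: the bin width, the function names (`mz_bin`, `build_spectrum_index`, `match_counts`) and the shared-peak count used as a score are all assumptions made for illustration. Each peak m/z is mapped to a bin, each bin stores the IDs of the spectra containing a peak there, and a peptide's theoretical fragment masses are then looked up against every spectrum at once, with no precursor mass filter.

```python
from collections import Counter, defaultdict

BIN_WIDTH = 0.02  # assumed fragment bin width in Da (hypothetical value)


def mz_bin(mz, width=BIN_WIDTH):
    """Map an m/z value to an integer bin."""
    return int(round(mz / width))


def build_spectrum_index(spectra):
    """Build an inverted index: m/z bin -> set of spectrum IDs with a peak there.

    `spectra` maps a spectrum ID to a list of peak m/z values.
    """
    index = defaultdict(set)
    for spectrum_id, peaks in spectra.items():
        for mz in peaks:
            index[mz_bin(mz)].add(spectrum_id)
    return index


def match_counts(index, fragment_mzs):
    """Count, for every spectrum, how many theoretical fragments hit a peak.

    Neighbouring bins are also checked, so the effective tolerance is
    roughly one bin width on either side of the fragment's own bin.
    """
    counts = Counter()
    for mz in fragment_mzs:
        b = mz_bin(mz)
        hits = set()
        for neighbour in (b - 1, b, b + 1):
            hits |= index.get(neighbour, set())
        counts.update(hits)  # each spectrum counted at most once per fragment
    return counts


# Toy data: three experimental spectra and one candidate's fragment masses.
spectra = {
    "s1": [100.00, 200.10, 300.20],
    "s2": [100.00, 250.00],
    "s3": [500.00],
}
index = build_spectrum_index(spectra)
counts = match_counts(index, [100.00, 200.10, 300.20])
print(counts.most_common())
```

Because the lookup cost depends on the number of fragments and not on the number of spectra, every peptide can be compared against the whole run in one pass; a real engine would replace the shared-peak count with a proper score and a validation step.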
Pages: 13