Dear-PSM: A deep learning-based peptide search engine enables full database search for proteomics

Cited by: 1
Authors
He, Qingzu [1, 2]
Li, Xiang [1]
Zhong, Jinjin [2, 3]
Yang, Gen [2, 4]
Han, Jiahuai [5]
Shuai, Jianwei [2, 3]
Affiliations
[1] Xiamen Univ, Natl Inst Data Sci Hlth & Med, Dept Phys, Xiamen, Peoples R China
[2] Univ Chinese Acad Sci, Wenzhou Inst, Wenzhou Key Lab Biophys, Wenzhou 325001, Zhejiang, Peoples R China
[3] Oujiang Lab, Zhejiang Lab Regenerat Med Vis & Brain Hlth, Wenzhou 325053, Zhejiang, Peoples R China
[4] Peking Univ, Sch Phys, State Key Lab Nucl Phys & Technol, Beijing, Peoples R China
[5] Xiamen Univ, Innovat Ctr Cell Signaling Network, Sch Life Sci, State Key Lab Cellular Stress Biol, Xiamen 361102, Fujian, Peoples R China
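Source
SMART MEDICINE | 2024 / Vol. 3 / Issue 03
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords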
deep learning; inverted index; mass spectrometry; peptide search; proteomics; SHOTGUN PROTEOMICS; MASS; IDENTIFICATION; TANDEM; SPECTRA; RATES;
DOI
10.1002/SMMD.20240014
Chinese Library Classification number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Peptide spectrum matching is the process of linking mass spectrometry data with peptide sequences. An experimental spectrum can match thousands of candidate peptides, and variable modifications lead to an exponential increase in the number of candidates. Completing the search within a limited time is therefore a key challenge. Traditional searches expedite the process by restricting peptide mass errors and variable modifications, but this limits interpretive capability. To address this challenge, we propose Dear-PSM, a peptide search engine that supports full database searching. Dear-PSM does not restrict peptide mass errors: it matches each spectrum to all peptides in the database and increases the number of variable modifications per peptide from the conventional 3 to 20. Leveraging inverted index technology, Dear-PSM creates a high-performance index table of experimental spectra and uses deep learning algorithms for peptide validation. With these techniques, Dear-PSM runs 7 times faster than mainstream search engines on a regular desktop computer, with a 240-fold reduction in memory consumption. Benchmark results demonstrate that Dear-PSM, in full database search mode, can reproduce over 90% of the results obtained by mainstream search engines when handling complex mass spectrometry data collected from different species using various instruments. Furthermore, it uncovers a substantial number of new peptides and proteins. Dear-PSM has been publicly released on GitHub.
The full database search strategy proposed in this study expands the search scope to include all peptide sequences within the database, with peptide mass tolerances extending to several thousand Daltons. Dear-PSM uses an inverted index algorithm to construct an index table for experimental spectra, enabling rapid searches, and employs deep learning algorithms for peptide validation. Moreover, Dear-PSM supports up to 20 variable modifications per peptide sequence and considers all possible combinations of these modifications, significantly expanding the peptide search space.
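The abstract describes an inverted index built over the experimental spectra rather than over the peptide database. The following is a minimal sketch of that general idea only, not Dear-PSM's actual implementation: the function names, the fixed-width m/z binning, the 0.05 bin width, and the shared-peak count used as a score are all assumptions made for illustration.

```python
from collections import defaultdict


def build_spectrum_index(spectra, bin_width=0.05):
    """Map each m/z bin to the ids of the spectra that have a peak in it.

    `spectra` is a dict {spectrum_id: [peak m/z values]}.
    The bin width is a hypothetical choice, not taken from the paper.
    """
    index = defaultdict(set)
    for spec_id, peaks in spectra.items():
        for mz in peaks:
            index[int(mz / bin_width)].add(spec_id)
    return index


def count_shared_peaks(index, fragment_mzs, bin_width=0.05):
    """Count, per spectrum, how many theoretical fragments fall in an occupied bin.

    One lookup per fragment touches only the spectra that contain a matching
    peak, so no precursor mass window is needed to limit the comparison.
    """
    hits = defaultdict(int)
    for mz in fragment_mzs:
        for spec_id in index.get(int(mz / bin_width), ()):
            hits[spec_id] += 1
    return dict(hits)


# Toy data: two experimental spectra and one candidate peptide's fragments.
spectra = {
    "s1": [175.119, 262.151, 375.235],
    "s2": [175.121, 400.21],
}
index = build_spectrum_index(spectra)
hits = count_shared_peaks(index, [175.118, 375.236, 500.0])
```

In this toy example every candidate peptide is compared against all indexed spectra in a single pass over its fragments. Simple fixed-width binning can miss matches that straddle a bin edge; a practical engine would need to handle that, and would pass the top-scoring candidates to a separate validation step.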
Pages: 13