A universal database reduction method based on the sequence tag strategy to facilitate large-scale database search in proteomics

Cited by: 2
Authors
Wang, Kai-Fei [1, 2]
Wu, Yu-Zhuo [1, 2]
Chi, Hao [1, 2]
Affiliations
[1] Chinese Acad Sci, CAS, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
Keywords
MASS SPECTROMETRISTS; PEPTIDES; TANDEM; IDENTIFICATION; METAPROTEOMICS; PROTEINS; IDENTIFY; SPECTRA;
DOI
10.1016/j.ijms.2022.116966
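Chinese Library Classification (CLC) number
O64 [Physical chemistry (theoretical chemistry), chemical physics]; O56 [Molecular physics, atomic physics]
Discipline classification codes
070203; 070304; 081704; 1406
Abstract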
Mass spectrometry-based metaproteomic and proteogenomic studies tend to use large-scale databases that may contain too many irrelevant or artificially constructed proteins. Such an imprecise database degrades the quality of peptide identification and increases search time. To address these problems, we developed DBReducer, a database reduction method for iterative database searching, which can precisely and effectively reduce a large-scale database and can interface with any downstream database search engine. In addition, an entrapment strategy was introduced to evaluate the identification precision and recall of different search modes. Compared with the common one-step database search and the traditional iterative database search, the iterative search with DBReducer improved the average peptide identification recall from 67.8% and 83.7%, respectively, to 93.5%, and the average peptide identification precision from 91.1% and 89.6%, respectively, to 91.3%. More importantly, using DBReducer reduced the time consumption by an average of 57.7% and 68.2%, respectively. Our results indicate that DBReducer has the potential to be a widely used database reduction method prior to common proteomic analysis, especially for scenarios with large-scale databases. (c) 2022 Published by Elsevier B.V.
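The general idea of sequence-tag-based database reduction can be illustrated with a minimal sketch. This is not DBReducer's actual algorithm, which the abstract does not specify. It assumes that short amino-acid tags have already been extracted from the spectra by some other tool, and it keeps only the proteins that contain enough of them. The function names `normalize` and `reduce_database` and the parameter `min_tag_hits` are hypothetical, introduced for illustration only.

```python
from typing import Dict, Iterable


def normalize(seq: str) -> str:
    """Upper-case a sequence and map I to L, since the two residues are isobaric."""
    return seq.upper().replace("I", "L")


def reduce_database(
    proteins: Dict[str, str],
    tags: Iterable[str],
    min_tag_hits: int = 1,
) -> Dict[str, str]:
    """Return the proteins whose sequence contains at least `min_tag_hits` distinct tags.

    `proteins` maps a protein ID to its amino-acid sequence.
    `tags` are short sequence tags (assumed to come from an upstream tag extractor).
    """
    unique_tags = {normalize(tag) for tag in tags if tag}
    reduced: Dict[str, str] = {}
    for protein_id, sequence in proteins.items():
        normalized = normalize(sequence)
        # Count how many distinct tags occur as substrings of this protein.
        hits = sum(1 for tag in unique_tags if tag in normalized)
        if hits >= min_tag_hits:
            reduced[protein_id] = sequence  # keep the original, un-normalized sequence
    return reduced


if __name__ == "__main__":
    # Toy database and tags, invented for this example.
    toy_db = {"P1": "MKTAYIAKQR", "P2": "GGGSSSGGG", "P3": "AAKQRLLEVT"}
    toy_tags = ["AKQR", "LEVT"]
    print(sorted(reduce_database(toy_db, toy_tags)))
```

In an iterative workflow, the reduced dictionary would be written back out as a smaller database and passed to the downstream search engine. Raising `min_tag_hits` trades recall for a smaller database.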
Pages: 11
Related papers
50 in total
  • [21] Assessing Face Image Quality: A Large-Scale Database and a Transformer Method
    Liu, Tie
    Li, Shengxi
    Xu, Mai
    Yang, Li
    Wang, Xiaofei
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (05) : 3981 - 4000
  • [22] An improvement of database with local search mechanisms for genetic algorithms in large-scale computing environments
    Hanada, Y
    Hiroyasu, T
    Miki, M
    2005 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-3, PROCEEDINGS, 2005, : 1974 - 1981
  • [23] The Terabase Search Engine: a large-scale relational database of short-read sequences
    Wilton, Richard
    Wheelan, Sarah J.
    Szalay, Alexander S.
    Salzberg, Steven L.
    BIOINFORMATICS, 2019, 35 (04) : 665 - 670
  • [24] Attention Based Glaucoma Detection: A Large-scale Database and CNN Model
    Li, Liu
    Xu, Mai
    Wang, Xiaofei
    Jiang, Lai
    Liu, Hanruo
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 10563 - 10572
  • [25] On Efficient Tree-Based Tag Search in Large-Scale RFID Systems
    Yu, Jihong
    Gong, Wei
    Liu, Jiangchuan
    Chen, Lin
    Wang, Kehao
    IEEE-ACM TRANSACTIONS ON NETWORKING, 2019, 27 (01) : 42 - 55
  • [26] Large-Scale Evolution Strategy Based on Search Direction Adaptation
    He, Xiaoyu
    Zhou, Yuren
    Chen, Zefeng
    Zhang, Jun
    Chen, Wei-Neng
    IEEE TRANSACTIONS ON CYBERNETICS, 2021, 51 (03) : 1651 - 1665
  • [27] Accelerating Large-Scale Biological Database Search on Xeon Phi-based Neo-Heterogeneous Architectures
    Lan, Haidong
    Liu, Weiguo
    Schmidt, Bertil
    Wang, Bingqiang
    PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE, 2015, : 503 - 510
  • [28] A Discriminant Color Space Method for Face Representation and Verification on a Large-scale Database
    Yang, Jian
    Liu, Chengjun
    19TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION, VOLS 1-6, 2008, : 2804 - 2807
  • [29] Effective and efficient melody-matching method in a large-scale music database
    Heo, SP
    ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS 2004: OTM 2004 WORKSHOPS, PROCEEDINGS, 2004, 3292 : 32 - 33
  • [30] Effective and efficient melody-matching method in a large-scale music database
    Heo, Sung-Phil
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2004, 3292 : 32 - 33