Sequence-based Prediction of Antimicrobial Peptides with CatBoost Classifier

Authors
Yu, Jen-Chieh [1]
Ni, Kuan [1]
Chen, Ching-Tai [2]
Affiliations
[1] Asia Univ, Dept Bioinformat & Med Engn, Taichung, Taiwan
[2] Asia Univ, Dept Bioinformat & Med Engn, Ctr Precis Hlth Res, Taichung, Taiwan
Keywords
antimicrobial peptide prediction; therapeutic peptide; disease; machine learning; bioinformatics; AMINO-ACID-COMPOSITION; FEATURE-SELECTION; PROTEIN; ANTIBACTERIAL
DOI
10.1109/BIBE55377.2022.00053
Chinese Library Classification
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Antimicrobial resistance is one of the most serious issues for human health. Compared with existing antibiotics, antimicrobial peptides (AMPs) have the advantage of efficiently killing microbes and other pathogens without inducing drug resistance. Large-scale experimental methods to characterize AMPs require wet-lab resources and considerable time. In silico prediction of AMPs, on the other hand, is an attractive strategy to lower the cost and time of discovering new AMPs. In this study, we propose a CatBoost model for AMP prediction. We included various features for the numerical representation of peptides and then employed a systematic approach to select 130 important features for our machine learning models. The CatBoost model achieves an accuracy, F1-score, MCC, and AUC of 0.758, 0.750, 0.518, and 0.831, respectively, in cross-validation. In an independent test on 188 peptide sequences, the proposed model achieves an accuracy, MCC, and AUC of 0.814, 0.632, and 0.884, respectively, all of which are better than those of five state-of-the-art methods. Our model improves on the MCC of the five existing methods by 2.6% to 21.1% and on their AUC by 1.3% to 13.3%. The results demonstrate that our CatBoost model is capable of yielding reliable results and can be of great help in discovering novel AMPs.
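The abstract does not list the features or the evaluation code, so the following is only a minimal sketch under stated assumptions. It shows one plausible sequence-derived feature, amino-acid composition (which appears among the keywords), and how accuracy, F1-score, and MCC can be computed from binary predictions. The function names `amino_acid_composition` and `binary_metrics` are hypothetical and are not taken from the paper; the authors' actual 130 features and their CatBoost configuration are not reproduced here.

```python
import math

# The 20 standard amino acids, in a fixed order so feature vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(peptide):
    """Return the fraction of each standard residue in the peptide (20 values).

    Illustrative feature only; the paper's actual feature set is not given
    in the abstract.
    """
    seq = peptide.upper()
    if not seq:
        raise ValueError("peptide must be non-empty")
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def binary_metrics(y_true, y_pred):
    """Compute accuracy, F1-score, and MCC for labels coded as 0 or 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)

    # F1 is the harmonic mean of precision and recall.
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0

    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


if __name__ == "__main__":
    # Toy inputs for illustration; these are not data from the study.
    features = amino_acid_composition("AAKK")
    print(len(features), sum(features))
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 0]))
```

In a pipeline like the one the abstract describes, vectors of this kind would be passed to a classifier and the metrics computed on held-out predictions. AUC is omitted here because it requires predicted scores rather than hard labels.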
Pages: 217-220
Number of pages: 4