Sequence-based Prediction of Antimicrobial Peptides with CatBoost Classifier

Authors
Yu, Jen-Chieh [1]
Ni, Kuan [1]
Chen, Ching-Tai [2]
Affiliations
[1] Asia Univ, Dept Bioinformat & Med Engn, Taichung, Taiwan
[2] Asia Univ, Dept Bioinformat & Med Engn, Ctr Precis Hlth Res, Taichung, Taiwan
Keywords
antimicrobial peptide prediction; therapeutic peptide; disease; machine learning; bioinformatics; AMINO-ACID-COMPOSITION; FEATURE-SELECTION; PROTEIN; ANTIBACTERIAL
DOI
10.1109/BIBE55377.2022.00053
Chinese Library Classification number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Antimicrobial resistance is one of the most serious issues for human health. Compared to existing antibiotics, antimicrobial peptides (AMPs) have the advantage of efficiently killing microbes and other pathogens without inducing drug resistance. Large-scale experimental characterization of AMPs requires wet-lab resources and considerable time. In silico prediction of AMPs, on the other hand, is an attractive strategy to lower the cost and time of discovering new AMPs. In this study, we propose a CatBoost model for AMP prediction. We included various features for the numerical representation of peptides, and then employed a systematic approach to select 130 important features for our machine learning models. The CatBoost model achieves an accuracy, F1-score, MCC, and AUC of 0.758, 0.750, 0.518, and 0.831, respectively, in cross-validation. On an independent test set of 188 peptide sequences, the proposed model achieves an accuracy, MCC, and AUC of 0.814, 0.632, and 0.884, respectively, all of which exceed those of five state-of-the-art methods. Relative to these five existing methods, our model improves MCC by 2.6% to 21.1% and AUC by 1.3% to 13.3%. The results demonstrate that our CatBoost model yields reliable predictions and can be of great help in discovering novel AMPs.
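For readers unfamiliar with the metrics reported in the abstract, the following is a minimal sketch of how accuracy, F1-score, MCC, and AUC are conventionally computed for a binary classifier. It uses the standard textbook definitions and is not the authors' code. The function names and the toy labels and scores are hypothetical and chosen only for illustration; labels are assumed to be 1 for AMP and 0 for non-AMP.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = AMP, 0 = non-AMP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, tn, fp, fn):
    """Harmonic mean of precision and recall; returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking: the fraction of
    (positive, negative) pairs in which the positive gets the higher score
    (ties count as half). Quadratic time, fine for small examples."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, not taken from the paper).
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(*counts))
print("F1:", f1_score(*counts))
print("MCC:", mcc(*counts))
print("AUC:", auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

MCC is worth reporting alongside accuracy because it uses all four cells of the confusion matrix and stays informative when the classes are imbalanced, whereas AUC is computed from the ranking of predicted scores rather than from thresholded labels.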
Pages: 217-220
Number of pages: 4