Sequence-based Prediction of Antimicrobial Peptides with CatBoost Classifier

Authors
Yu, Jen-Chieh [1]
Ni, Kuan [1]
Chen, Ching-Tai [2]
Affiliations
[1] Asia Univ, Dept Bioinformat & Med Engn, Taichung, Taiwan
[2] Asia Univ, Dept Bioinformat & Med Engn, Ctr Precis Hlth Res, Taichung, Taiwan
Keywords
antimicrobial peptide prediction; therapeutic peptide; disease; machine learning; bioinformatics; AMINO-ACID-COMPOSITION; FEATURE-SELECTION; PROTEIN; ANTIBACTERIAL
DOI
10.1109/BIBE55377.2022.00053
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Antimicrobial resistance is one of the most serious issues for human health. Compared to existing antibiotics, antimicrobial peptides (AMPs) have the advantage of efficiently killing microbes and other pathogens without inducing drug resistance. Large-scale experimental methods to characterize AMPs require wet-lab resources and more time. In silico prediction of AMPs, on the other hand, is an attractive strategy to lower the cost and time of discovering new AMPs. In this study, we propose a CatBoost model for AMP prediction. We included various features for the numerical representation of peptides, and then employed a systematic approach to select 130 important features for our machine learning models. The CatBoost model achieves an accuracy, F1-score, MCC, and AUC of 0.758, 0.750, 0.518, and 0.831, respectively, in cross-validation. In an independent test based on 188 peptide sequences, the proposed model achieves an accuracy, MCC, and AUC of 0.814, 0.632, and 0.884, respectively, all of which are the best among the compared methods, which include five state-of-the-art methods. Our model improves on the MCC of the five existing methods by 2.6% to 21.1%, and on their AUC by 1.3% to 13.3%. The results demonstrate that our CatBoost model is capable of yielding reliable results and can be of great help in discovering novel AMPs.
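The abstract mentions a numerical representation of peptides and reports accuracy, F1-score, and MCC. As a minimal sketch of what those terms mean, the Python below computes an amino acid composition vector (one common sequence feature, named in the keywords; the record does not list the paper's actual 130 features) and the three metrics from confusion-matrix counts. The function names and the example counts are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# The 20 standard amino acids, in a fixed order so every peptide
# maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in a peptide.

    Illustrative feature only; non-standard characters are ignored.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def binary_metrics(tp, fp, tn, fn):
    """Compute accuracy, F1-score, and MCC from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC is defined as 0 when any marginal sum is zero.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


if __name__ == "__main__":
    # Hypothetical peptide and hypothetical counts, for illustration only.
    print(amino_acid_composition("AAKK"))
    print(binary_metrics(tp=40, fp=10, tn=40, fn=10))
```

In a pipeline like the one described, vectors of this kind would be concatenated with other descriptors, filtered by feature selection, and passed to a gradient-boosting classifier; those steps are not shown here because the record gives no details on them.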
Pages: 217-220
Number of pages: 4