Development and rigorous validation of antimalarial predictive models using machine learning approaches

Cited by: 11
Authors
Danishuddin [1]
Madhukar, G. [1]
Malik, M. Z. [1]
Subbarao, N. [1]
Affiliations
[1] Jawaharlal Nehru Univ, Sch Computat & Integrat Sci, New Delhi, India
Keywords
Antimalarial; predictive models; machine learning; calibration; predictiveness curve; ARTEMISININ RESISTANCE; DISCOVERY; IDENTIFICATION; QSAR;
DOI
10.1080/1062936X.2019.1635526
Chinese Library Classification number
O6 [Chemistry];
Subject classification code
0703;
Abstract
A large collection of known, experimentally verified compounds from the ChEMBL database was used to build classification models for predicting antimalarial activity against Plasmodium falciparum. Four machine learning methods, namely support vector machine (SVM), random forest (RF), k-nearest neighbour (kNN) and XGBoost, were used to develop models on this diverse antimalarial dataset. A well-established feature selection framework was used to select the best subset from a larger pool of descriptors. Model performance was rigorously evaluated using applicability domain analysis, Y-scrambling and AUC-ROC curves. The predictive power of the models was also assessed using probability calibration and predictiveness curves. SVM and XGBoost showed the best performance, yielding an accuracy of 85% on the independent test set. In terms of probability prediction, SVM and XGBoost were well calibrated. Total gain (TG) from the predictiveness curve also favoured SVM (TG = 0.67) and XGBoost (TG = 0.75). These models also assigned high probability scores to the high-affinity compounds from a PubChem antimalarial bioassay used as external validation. Our findings suggest that the selected models are robust and potentially useful for facilitating the discovery of antimalarial agents.
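The abstract mentions probability calibration without giving details. As a rough illustration only, the sketch below shows one common way to check calibration: group predictions into probability bins, compare the mean predicted probability with the observed fraction of actives in each bin, and summarise with a Brier score. The function names, the bin count and the toy data are all assumptions made for this example; the Brier score is a generic choice and is not stated to be the measure used in the paper.

```python
from typing import List, Tuple


def brier_score(y_true: List[int], y_prob: List[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def reliability_bins(
    y_true: List[int], y_prob: List[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Group predictions into equal-width probability bins.

    Returns one (mean predicted probability, observed active fraction, count)
    tuple per non-empty bin. A well-calibrated model has the first two
    values close to each other in every bin.
    """
    sums = [[0.0, 0, 0] for _ in range(n_bins)]  # [prob sum, actives, count]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        sums[idx][0] += p
        sums[idx][1] += y
        sums[idx][2] += 1
    return [(s[0] / s[2], s[1] / s[2], s[2]) for s in sums if s[2] > 0]


# Hypothetical labels (1 = active) and predicted probabilities, not from the paper.
y_true = [0, 0, 0, 1, 0, 1, 1, 1]
y_prob = [0.1, 0.1, 0.3, 0.3, 0.5, 0.5, 0.9, 0.9]

for mean_p, frac_active, n in reliability_bins(y_true, y_prob):
    print(f"mean predicted={mean_p:.2f}  observed={frac_active:.2f}  n={n}")
print(f"Brier score: {brier_score(y_true, y_prob):.3f}")
```

In practice the probabilities would come from a fitted classifier evaluated on a held-out test set, and a lower Brier score together with bins lying near the diagonal would indicate better calibration.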
Pages: 543-560
Page count: 18
Related papers
50 in total
  • [11] Development of Predictive Models in Patients with Epiphora Using Lacrimal Scintigraphy and Machine Learning
    Yong-Jin Park
    Ji Hoon Bae
    Mu Heon Shin
    Seung Hyup Hyun
    Young Seok Cho
    Yearn Seong Choe
    Joon Young Choi
    Kyung-Han Lee
    Byung-Tae Kim
    Seung Hwan Moon
    Nuclear Medicine and Molecular Imaging, 2019, 53 : 125 - 135
  • [12] Development and validation of HBV surveillance models using big data and machine learning
    Dong, Weinan
    Da Roza, Cecilia Clara
    Cheng, Dandan
    Zhang, Dahao
    Xiang, Yuling
    Seto, Wai Kay
    Wong, William C. W.
    ANNALS OF MEDICINE, 2024, 56 (01)
  • [13] Development of Predictive Models using Machine Learning Algorithms for Food Adulterants Bacteria Detection
    Amado, Timothy M.
    Burman, Ma Rica
    Chicote, Relamae F.
    Espenida, Sheila May C.
    Masangcay, Honeyleth L.
    Ventura, Camille H.
    Tolentino, Lean Karlo S.
    Padilla, Maria Victoria C.
    Madrigal, Gilfred Allen M.
    Enriquez, Lejan Alfred C.
    2019 IEEE 11TH INTERNATIONAL CONFERENCE ON HUMANOID, NANOTECHNOLOGY, INFORMATION TECHNOLOGY, COMMUNICATION AND CONTROL, ENVIRONMENT, AND MANAGEMENT (HNICEM), 2019,
  • [14] Development of predictive models for personalized, precision medicine in colorectal cancer using machine learning
    Hung, Man
    Hon, Shirley
    Gu, Yushan
    Bounsanga, Jerry
    Hon, Eric
    Hansen, Alec R.
    Nielson, Dominique
    Voss, Maren
    QUALITY OF LIFE RESEARCH, 2017, 26 (01) : 65 - 65
  • [15] Development of predictive models for density of hybrid nanofluids using different machine learning techniques
    Gupta, Amit Kumar
    Mathur, Priya
    Oyedeji, Mojeed Opeyemi
    Alade, Ibrahim Olanrewaju
    Qahtan, Talal F.
    Gupta, Sparsh
    PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART E-JOURNAL OF PROCESS MECHANICAL ENGINEERING, 2023, 237 (05) : 1722 - 1739
  • [16] A comprehensive analysis of stroke risk factors and development of a predictive model using machine learning approaches
    Songquan Xie
    Shuting Peng
    Long Zhao
    Binbin Yang
    Yukun Qu
    Xiaoping Tang
    Molecular Genetics and Genomics, 2025, 300 (1)
  • [17] Predictive modeling for the development of diabetes mellitus using key factors in various machine learning approaches
    Tanaka, Marenao
    Akiyama, Yukinori
    Mori, Kazuma
    Hosaka, Itaru
    Kato, Kenichi
    Endo, Keisuke
    Ogawa, Toshifumi
    Sato, Tatsuya
    Suzuki, Toru
    Yano, Toshiyuki
    Ohnishi, Hirofumi
    Hanawa, Nagisa
    Furuhashi, Masato
    DIABETES EPIDEMIOLOGY AND MANAGEMENT, 2024, 13
  • [18] Development and validation of a multivariate predictive model for rheumatoid arthritis mortality using a machine learning approach
    Lezcano-Valverde, Jose M.
    Salazar, Fernando
    Leon, Leticia
    Toledano, Esther
    Jover, Juan A.
    Fernandez-Gutierrez, Benjamin
    Soudah, Eduardo
    Gonzalez-Alvaro, Isidoro
    Abasolo, Lydia
    Rodriguez-Rodriguez, Luis
    SCIENTIFIC REPORTS, 2017, 7
  • [19] Development and validation of a multivariate predictive model for rheumatoid arthritis mortality using a machine learning approach
    José M. Lezcano-Valverde
    Fernando Salazar
    Leticia León
    Esther Toledano
    Juan A. Jover
    Benjamín Fernandez-Gutierrez
    Eduardo Soudah
    Isidoro González-Álvaro
    Lydia Abasolo
    Luis Rodriguez-Rodriguez
    Scientific Reports, 7
  • [20] A machine learning predictive model for recurrence of resected distal cholangiocarcinoma: Development and validation of predictive model using artificial intelligence
    Perez, Marc
    Hansen, Carsten Palnaes
    Burdio, Fernando
    Sanchez-Velazquez, Patricia
    Giuliani, Antonio
    Lancellotti, Francesco
    de Liguori-Carino, Nicola
    Malleo, Giuseppe
    Marchegiani, Giovanni
    Podda, Mauro
    Pisanu, Adolfo
    De Luca, Giuseppe Massimiliano
    Anselmo, Alessandro
    Siragusa, Leandro
    Burgdorf, Stefan Kobbelgaard
    Tschuor, Christoph
    Cacciaguerra, Andrea Benedetti
    Koh, Ye Xin
    Masuda, Yoshio
    Xuan, Mark Yeo Hao
    Seeger, Nico
    Breitenstein, Stefan
    Grochola, Filip Lukasz
    Di Martino, Marcello
    Secanella, Luis
    Busquets, Juli
    Dorcaratto, Dimitri
    Mora-Oliver, Isabel
    Ingallinella, Sara
    Salvia, Roberto
    Abu Hilal, Mohammad
    Aldrighetti, Luca
    Ielpo, Benedetto
    EJSO, 2024, 50 (07):