Artificial intelligence paradigm for ligand-based virtual screening on the drug discovery of type 2 diabetes mellitus

Cited by: 9
Authors
Bustamam, Alhadi [1]
Hamzah, Haris [1]
Husna, Nadya A. [1]
Syarofina, Sarah [1]
Dwimantara, Nalendra [1]
Yanuar, Arry [2]
Sarwinda, Devvi [1]
Affiliations
[1] Univ Indonesia, Fac Math & Nat Sci, Dept Math, Depok, Indonesia
[2] Univ Indonesia, Fac Pharm, Gedung A Rumpun Ilmu Kesehatan Lantai 1, Depok, Indonesia
Keywords
Quantitative structure-activity relationship (QSAR); K-modes clustering; CatBoost; Rotation Forest; Principal component analysis; Sparse principal component analysis; Deep neural network; Fingerprint; Physicochemical parameters
DOI
10.1186/s40537-021-00465-3
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Background: New dipeptidyl peptidase-4 (DPP-4) inhibitors with low adverse effects are needed for the treatment of type 2 diabetes mellitus. This study builds quantitative structure-activity relationship (QSAR) models using an artificial intelligence paradigm. Rotation Forest and a Deep Neural Network (DNN) are used to build the QSAR models. Principal component analysis (PCA) is compared with sparse PCA (SPCA) as the transformation method within Rotation Forest. K-modes clustering with Levenshtein distance is used to select molecules, and CatBoost is used for feature selection.
Results: Molecule selection with the K-modes clustering algorithm yielded 1020 DPP-4 inhibitor molecules, with logP values ranging from -1.6693 to 4.99044. Extended connectivity fingerprints and functional class fingerprints with diameters of 4 and 6 were used to construct four fingerprint datasets: ECFP_4, ECFP_6, FCFP_4, and FCFP_6. The 1024 features from the four fingerprint datasets were then subjected to feature selection with CatBoost. With CatBoost feature selection, the QSAR models built with both machine learning and deep learning methods performed well: Sensitivity, Specificity, Accuracy, and Matthews correlation coefficient were all above 70% at feature importance levels of 60%, 70%, 80%, and 90%.
Conclusion: The K-modes clustering algorithm can produce a representative subset of DPP-4 inhibitor molecules. Feature selection on the fingerprint datasets using CatBoost is best applied before building QSAR Classification and QSAR Regression models. QSAR Classification using Machine Learning and QSAR Classification using Deep Learning each achieve an accuracy above 70%. The QSAR RFC-PCA and QSAR RFR-PCA models performed better than the QSAR RFC-SPCA and QSAR RFR-SPCA models because they are more time-efficient.
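The four evaluation metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' code: the function names and the label convention (1 for an active molecule, 0 for an inactive one) are assumptions made for illustration. Note that the Matthews correlation coefficient ranges from -1 to 1; the abstract reports it on a percentage scale.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (assumed: 1 = active, 0 = inactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Return sensitivity, specificity, accuracy, and MCC as a dict (hypothetical helper)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    # Guard each ratio against an empty denominator.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the study.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
    print(classification_metrics(y_true, y_pred))
```

Reporting MCC alongside accuracy is useful when the active and inactive classes are imbalanced, since MCC uses all four cells of the confusion matrix.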
Pages: 21