Drug-Target Interactions Prediction at Scale: The Komet Algorithm with the LCIdb Dataset

Authors
Guichaoua, Gwenn [1, 2, 3]
Pinel, Philippe [1, 2, 3, 4]
Hoffmann, Brice [4]
Azencott, Chloe-Agathe [1, 2, 3]
Stoven, Veronique [1, 2, 3]
Affiliations
[1] Mines Paris PSL, Ctr Computat Biol CBIO, F-75006 Paris, France
[2] Univ PSL, Inst Curie, F-75005 Paris, France
[3] INSERM, U900, F-75005 Paris, France
[4] Iktos SAS, F-75017 Paris, France
Keywords
CHEMICAL-STRUCTURE; PROTEIN; DATABASE; LANGUAGE
DOI
10.1021/acs.jcim.4c00422
Chinese Library Classification number
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Drug-target interaction (DTI) prediction algorithms are used at various stages of the drug discovery process. In this context, specific problems such as deorphanization of a new therapeutic target, or target identification for a drug candidate arising from phenotypic screens, require large-scale predictions across the protein and molecule spaces. DTI prediction relies heavily on supervised learning algorithms that use known DTIs to learn associations between molecule and protein features, allowing new interactions to be predicted from the learned patterns. The algorithms must be broadly applicable to enable reliable predictions, even in regions of the protein or molecule spaces where data are scarce. In this paper, we address two key challenges to fulfill these goals: building large, high-quality training datasets, and designing prediction methods that scale well enough to be trained on such datasets. First, we introduce LCIdb, a large, curated dataset of DTIs offering extensive coverage of both the molecule and druggable protein spaces. Notably, LCIdb contains many more molecules than publicly available benchmarks, expanding coverage of the molecule space. Second, we propose Komet (Kronecker Optimized METhod), a DTI prediction pipeline designed for scalability without compromising performance. Komet follows a three-step framework that incorporates efficient computation choices tailored to large datasets, including the Nyström approximation. Specifically, Komet employs a Kronecker interaction module for (molecule, protein) pairs, which efficiently captures determinants in DTIs and whose structure allows for reduced computational complexity and quasi-Newton optimization, so that the model can handle large training sets without loss of performance. Our method is implemented in open-source software, leveraging GPU parallel computation for efficiency.
We demonstrate the value of our pipeline on various datasets, showing that Komet displays superior scalability and prediction performance compared to state-of-the-art deep learning approaches. Additionally, we illustrate the generalization properties of Komet by showing its performance on an external dataset and on the publicly available LH benchmark designed for scaffold hopping problems.
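The two computational ideas named in the abstract, the Nyström approximation and the Kronecker interaction module, can be illustrated with a minimal NumPy sketch. This is not the Komet implementation: the function names, argument names, and array shapes below are hypothetical, and the kernels are left unspecified. The sketch only shows (a) how kernel values against a set of landmark items can be turned into explicit low-dimensional features, and (b) why a linear model on the Kronecker product of molecule and protein features can be scored without ever forming the product.

```python
import numpy as np


def nystrom_features(K_landmarks, K_cross, rank):
    """Explicit features from a Nystrom approximation (illustrative sketch).

    K_landmarks: (r, r) symmetric kernel matrix among r landmark items.
    K_cross:     (n, r) kernel values between n items and the landmarks.
    Returns X of shape (n, rank) such that X @ X.T approximates the
    full (n, n) kernel matrix.
    """
    eigvals, eigvecs = np.linalg.eigh(K_landmarks)
    # eigh returns eigenvalues in ascending order; keep the `rank` largest.
    idx = np.argsort(eigvals)[::-1][:rank]
    # Clip tiny or negative eigenvalues to avoid dividing by zero.
    lam = np.clip(eigvals[idx], 1e-12, None)
    return K_cross @ eigvecs[:, idx] / np.sqrt(lam)


def pair_scores_explicit(W, X_mol, X_prot, mol_idx, prot_idx):
    """Reference scoring: build each Kronecker feature vector explicitly.

    Cost per pair is O(d_mol * d_prot) in memory as well as time.
    """
    w = W.reshape(-1)  # row-major flattening matches np.kron ordering
    feats = np.stack(
        [np.kron(X_mol[i], X_prot[j]) for i, j in zip(mol_idx, prot_idx)]
    )
    return feats @ w


def pair_scores_factored(W, X_mol, X_prot, mol_idx, prot_idx):
    """Same scores via the identity <vec(W), x_m (x) x_p> = x_m^T W x_p.

    The Kronecker features are never materialized.
    """
    return np.einsum("nd,de,ne->n", X_mol[mol_idx], W, X_prot[prot_idx])
```

In this sketch, `W` plays the role of a weight matrix of shape `(d_mol, d_prot)`, and `mol_idx` and `prot_idx` are integer arrays listing which molecule and which protein form each training pair. Because both scoring functions return the same values, the factored form can replace the explicit one inside an optimizer, which is one way a Kronecker structure can reduce computational cost on large sets of pairs.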
Pages: 6938-6956
Number of pages: 19