Drug-Target Interactions Prediction at Scale: The Komet Algorithm with the LCIdb Dataset

Authors
Guichaoua, Gwenn [1 ,2 ,3 ]
Pinel, Philippe [1 ,2 ,3 ,4 ]
Hoffmann, Brice [4 ]
Azencott, Chloe-Agathe [1 ,2 ,3 ]
Stoven, Veronique [1 ,2 ,3 ]
Affiliations
[1] Mines Paris PSL, Ctr Computat Biol CBIO, F-75006 Paris, France
[2] Univ PSL, Inst Curie, F-75005 Paris, France
[3] INSERM, U900, F-75005 Paris, France
[4] Iktos SAS, F-75017 Paris, France
Keywords
CHEMICAL-STRUCTURE; PROTEIN; DATABASE; LANGUAGE
DOI
10.1021/acs.jcim.4c00422
Chinese Library Classification
R914 [Medicinal Chemistry]
Discipline classification code
100701
Abstract
Drug-target interaction (DTI) prediction algorithms are used at various stages of the drug discovery process. In this context, specific problems such as deorphanization of a new therapeutic target, or target identification for a drug candidate arising from phenotypic screens, require large-scale predictions across the protein and molecule spaces. DTI prediction relies heavily on supervised learning algorithms that use known DTIs to learn associations between molecule and protein features, allowing new interactions to be predicted from the learned patterns. The algorithms must be broadly applicable to enable reliable predictions, even in regions of the protein or molecule spaces where data may be scarce. In this paper, we address two key challenges to fulfill these goals: building large, high-quality training datasets, and designing prediction methods that scale well enough to be trained on such large datasets. First, we introduce LCIdb, a large, curated dataset of DTIs offering extensive coverage of both the molecule and druggable protein spaces. Notably, LCIdb contains a much higher number of molecules than publicly available benchmarks, expanding coverage of the molecule space. Second, we propose Komet (Kronecker Optimized METhod), a DTI prediction pipeline designed for scalability without compromising performance. Komet leverages a three-step framework, incorporating efficient computation choices tailored for large datasets and involving the Nyström approximation. Specifically, Komet employs a Kronecker interaction module for (molecule, protein) pairs, which efficiently captures determinants in DTIs, and whose structure allows for reduced computational complexity and quasi-Newton optimization, ensuring that the model can handle large training sets. Our method is implemented in open-source software, leveraging GPU parallel computation for efficiency.
We demonstrate the value of our pipeline on various datasets, showing that Komet displays superior scalability and prediction performance compared to state-of-the-art deep learning approaches. Additionally, we illustrate the generalization properties of Komet by showing its performance on an external dataset, and on the publicly available LH benchmark designed for scaffold hopping problems.
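The two computational ideas named in the abstract, the Nyström approximation and the Kronecker interaction module, can be illustrated with a small NumPy sketch. This is not the Komet implementation or its API: the function names (nystrom_features, kron_scores), the toy data, and the choice of landmarks are all hypothetical, and the quasi-Newton training step is omitted. The sketch only shows why the Kronecker structure is cheap: the score of a (molecule, protein) pair under a linear model on the Kronecker product of their features equals a bilinear form, so the product never has to be materialized.

```python
import numpy as np

def nystrom_features(K_nl, K_ll, rel_eps=1e-10):
    """Return features Phi with Phi @ Phi.T ~= K (Nystrom approximation).

    K_nl: (n, r) kernel values between all n points and r landmark points.
    K_ll: (r, r) symmetric PSD kernel matrix among the landmarks.
    """
    eigvals, eigvecs = np.linalg.eigh(K_ll)
    keep = eigvals > rel_eps * eigvals.max()  # drop numerically null directions
    # Columns of K_ll^(-1/2), restricted to the kept eigen-directions.
    inv_sqrt = eigvecs[:, keep] / np.sqrt(eigvals[keep])
    return K_nl @ inv_sqrt

def kron_scores(W, X_mol, Y_prot, mol_idx, prot_idx):
    """Score pairs as x_m^T W y_p, i.e. <vec(W), kron(x_m, y_p)>,
    without ever forming the Kronecker product."""
    return np.einsum("id,de,ie->i", X_mol[mol_idx], W, Y_prot[prot_idx])

# Toy data (assumption): 6 molecules with a rank-3 kernel, 4 of them landmarks.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 3))
K_mol = A @ A.T
landmarks = [0, 1, 2, 3]
X_mol = nystrom_features(K_mol[:, landmarks], K_mol[np.ix_(landmarks, landmarks)])

# Toy protein features and a random weight matrix standing in for a trained model.
Y_prot = rng.normal(size=(4, 2))
W = rng.normal(size=(X_mol.shape[1], Y_prot.shape[1]))

# Three (molecule, protein) pairs, given as index arrays.
mol_idx = np.array([0, 2, 5])
prot_idx = np.array([1, 3, 0])

fast = kron_scores(W, X_mol, Y_prot, mol_idx, prot_idx)
# Reference computation that explicitly builds each Kronecker feature vector.
slow = np.array([np.kron(X_mol[m], Y_prot[p]) @ W.ravel()
                 for m, p in zip(mol_idx, prot_idx)])
```

In this toy setting the landmarks span the full range of the kernel, so the Nyström features reproduce it up to rounding; with real data the approximation quality depends on how many landmarks are used and how they are chosen. The explicit Kronecker vector has length d_mol * d_prot per pair, whereas the bilinear form only touches the two small feature vectors and W.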
Pages: 6938-6956
Page count: 19