Validation of Deep Learning-Based DFCNN in Extremely Large-Scale Virtual Screening and Application in Trypsin I Protease Inhibitor Discovery

Cited by: 2
Authors
Zhang, Haiping [1]
Lin, Xiao [2]
Wei, Yanjie [1]
Zhang, Huiling [1]
Liao, Linbu [3]
Wu, Hao [1]
Pan, Yi [1]
Wu, Xuli [2]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Ctr High Performance Comp, Joint Engn Res Ctr Hlth Big Data Intelligent Anal, Shenzhen, Peoples R China
[2] Shenzhen Univ, Sch Med, Shenzhen, Peoples R China
[3] Zhejiang Univ, Coll Software Technol, Hangzhou, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
extremely large-scale virtual screening; deep learning; DFCNN; Trypsin I Protease; de novo drug screening; DRUG DISCOVERY; DOCKING; SOFTWARE;
DOI
10.3389/fmolb.2022.872086
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Computational methods that run on affordable computational resources are highly desirable for identifying active drug leads from millions of compounds. This requires a model that is both highly efficient and reasonably accurate, a combination that most current methods cannot achieve. In real virtual screening (VS) application scenarios, a useful method should select active compounds far better than random chance. Here, we systematically evaluate the performance of our previously developed DFCNN model in large-scale virtual screening. The results show that, with a score cutoff of 0.99, our method achieves on average approximately 22 times the success rate of random chance. Of the 102 test cases, 10 have more than 98 times the success rate of a random guess. Interestingly, in three cases, the prediction success rate is 99 times that of a random guess at a score cutoff of 0.99. This indicates that in most situations, after our extremely large-scale VS, the dataset can be reduced by a factor of 20 to 100 for the next step of virtual screening based on docking or MD simulation. Furthermore, we have verified our computational method experimentally by finding several active inhibitors of Trypsin I Protease. In addition, we show its proof-of-concept application in de novo drug screening. The results indicate the massive potential of this method as the first step of a real drug development workflow. Moreover, DFCNN takes only about 0.0000225 s on average for one protein-compound prediction with 80 Intel CPU cores (2.00 GHz) and 60 GB RAM, which is at least tens of thousands of times faster than AutoDock Vina or Schrödinger high-throughput virtual screening. Additionally, an online webserver based on DFCNN for large-scale screening is available at for the convenience of the users.
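The "times the success rate of random chance" figures in the abstract can be read as an enrichment ratio at a score cutoff. The following is a minimal sketch of that arithmetic, not the authors' evaluation code: the function names `enrichment_over_random` and `screening_seconds` are hypothetical, and it assumes that a compound counts as selected when its score is greater than or equal to the cutoff and that "random chance" means the fraction of actives in the whole library. The second helper only multiplies the per-prediction time quoted in the abstract by a library size.

```python
from typing import Sequence


def enrichment_over_random(
    scores: Sequence[float],
    is_active: Sequence[bool],
    cutoff: float = 0.99,
) -> float:
    """Hit rate among compounds with score >= cutoff, divided by the
    hit rate expected from picking compounds at random.

    Assumption: ">= cutoff" selection and "actives / library size" as the
    random baseline; the paper may define these differently.
    """
    if len(scores) != len(is_active) or len(scores) == 0:
        raise ValueError("scores and is_active must be non-empty and equal in length")

    # Activity labels of the compounds that pass the score cutoff.
    selected = [active for score, active in zip(scores, is_active) if score >= cutoff]
    total_actives = sum(is_active)
    if not selected or total_actives == 0:
        return float("nan")  # ratio is undefined without selections or actives

    hit_rate = sum(selected) / len(selected)
    random_rate = total_actives / len(scores)
    return hit_rate / random_rate


def screening_seconds(n_pairs: int, seconds_per_pair: float = 0.0000225) -> float:
    """Estimated wall time for n_pairs protein-compound predictions,
    using the average per-prediction time quoted in the abstract."""
    return n_pairs * seconds_per_pair


if __name__ == "__main__":
    # Toy data (made up for illustration): 10 compounds, 2 actives.
    toy_scores = [0.995, 0.999, 0.5, 0.2, 0.1, 0.3, 0.4, 0.6, 0.7, 0.8]
    toy_labels = [True, False, True, False, False, False, False, False, False, False]
    print(enrichment_over_random(toy_scores, toy_labels))
    print(screening_seconds(1_000_000))
```

Under these assumptions, a ratio of about 22 means that the compounds kept after the cutoff contain actives at roughly 22 times the library-wide rate, which is what allows the later docking or MD stage to work on a much smaller set.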
Pages: 15