Validation of Deep Learning-Based DFCNN in Extremely Large-Scale Virtual Screening and Application in Trypsin I Protease Inhibitor Discovery

Cited by: 2
Authors
Zhang, Haiping [1]
Lin, Xiao [2]
Wei, Yanjie [1]
Zhang, Huiling [1]
Liao, Linbu [3]
Wu, Hao [1]
Pan, Yi [1]
Wu, Xuli [2]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Ctr High Performance Comp, Joint Engn Res Ctr Hlth Big Data Intelligent Anal, Shenzhen, Peoples R China
[2] Shenzhen Univ, Sch Med, Shenzhen, Peoples R China
[3] Zhejiang Univ, Coll Software Technol, Hangzhou, Peoples R China
Funding
National Science Foundation (USA)
Keywords
extremely large-scale virtual screening; deep learning; DFCNN; Trypsin I Protease; de novo drug screening; drug discovery; docking; software
DOI
10.3389/fmolb.2022.872086
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Computational methods that run on affordable computational resources are highly desirable for identifying active drug leads among millions of compounds. This requires a model that is both highly efficient and reasonably accurate, a combination that most current methods cannot achieve. In real virtual screening (VS) scenarios, a useful method should select active compounds far more often than random chance would. Here, we systematically evaluate the performance of our previously developed DFCNN model in large-scale virtual screening. The results show that, at a score cutoff of 0.99, our method achieves on average approximately 22 times the success rate of random chance. Of the 102 test cases, 10 have more than 98 times the success rate of a random guess. Interestingly, in three cases, the prediction success rate is 99 times that of a random guess at a score cutoff of 0.99. This indicates that, in most situations, our extremely large-scale VS can reduce the dataset 20- to 100-fold before the next step of virtual screening based on docking or molecular dynamics (MD) simulation. Furthermore, we verified the computational method experimentally by finding several active inhibitors of Trypsin I Protease. We also show a proof-of-concept application in de novo drug screening. The results indicate the massive potential of this method as the first step of a real drug development workflow. Moreover, DFCNN takes only about 0.0000225 s on average for one protein-compound prediction with 80 Intel CPU cores (2.00 GHz) and 60 GB RAM, which is at least tens of thousands of times faster than AutoDock Vina or Schrödinger high-throughput virtual screening. Additionally, an online webserver based on DFCNN for large-scale screening is available for the convenience of users.
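The "times the success rate of random chance" figures in the abstract can be read as an enrichment ratio. The Python sketch below is illustrative only and is not the authors' evaluation code: it assumes the ratio is the fraction of actives among compounds scoring at or above the cutoff, divided by the fraction of actives in the whole library. The function names `enrichment_over_random` and `screening_seconds`, and the choice of `>=` for the cutoff test, are assumptions made for this example. The second helper simply scales the per-prediction time quoted in the abstract.

```python
from typing import Sequence


def enrichment_over_random(
    scores: Sequence[float],
    is_active: Sequence[bool],
    cutoff: float = 0.99,
) -> float:
    """Hit rate among compounds with score >= cutoff, divided by the library base rate.

    Assumed definition for illustration; the paper's exact protocol may differ.
    """
    if len(scores) != len(is_active) or len(scores) == 0:
        raise ValueError("scores and is_active must be non-empty and equal in length")

    # Labels of the compounds that pass the score cutoff.
    selected = [active for score, active in zip(scores, is_active) if score >= cutoff]
    total_actives = sum(is_active)
    if not selected or total_actives == 0:
        return 0.0  # nothing selected, or no actives to find

    hit_rate = sum(selected) / len(selected)    # success rate of the model's picks
    base_rate = total_actives / len(scores)     # success rate of a random pick
    return hit_rate / base_rate


def screening_seconds(n_pairs: int, seconds_per_pair: float = 0.0000225) -> float:
    """Estimated wall time, using the average per-prediction time quoted in the abstract."""
    return n_pairs * seconds_per_pair


if __name__ == "__main__":
    # Toy library (made-up data): 1000 compounds, 10 actives.
    # Five actives and five inactives score above the cutoff.
    scores = [0.995] * 5 + [0.5] * 5 + [0.995] * 5 + [0.1] * 985
    labels = [True] * 10 + [False] * 990
    print(enrichment_over_random(scores, labels))
    print(screening_seconds(1_000_000))
```

Under this assumed definition, an enrichment of about 22 means that a compound chosen from the high-scoring set is about 22 times more likely to be active than one drawn at random from the library.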
Pages: 15