Validation of Deep Learning-Based DFCNN in Extremely Large-Scale Virtual Screening and Application in Trypsin I Protease Inhibitor Discovery

Cited by: 2
Authors
Zhang, Haiping [1]
Lin, Xiao [2]
Wei, Yanjie [1]
Zhang, Huiling [1]
Liao, Linbu [3]
Wu, Hao [1]
Pan, Yi [1]
Wu, Xuli [2]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Ctr High Performance Comp, Joint Engn Res Ctr Hlth Big Data Intelligent Anal, Shenzhen, Peoples R China
[2] Shenzhen Univ, Sch Med, Shenzhen, Peoples R China
[3] Zhejiang Univ, Coll Software Technol, Hangzhou, Peoples R China
Funding
National Science Foundation (USA)
Keywords
extremely large-scale virtual screening; deep learning; DFCNN; Trypsin I Protease; de novo drug screening; DRUG DISCOVERY; DOCKING; SOFTWARE;
DOI
10.3389/fmolb.2022.872086
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Computational methods that run on affordable computational resources are highly desirable for identifying active drug leads from millions of compounds. This requires a model that is both highly efficient and reasonably accurate, a combination that most current methods cannot achieve. In real virtual screening (VS) scenarios, a useful method should select active compounds far more often than random chance would. Here, we systematically evaluate the performance of our previously developed DFCNN model in large-scale virtual screening. The results show that, with a score cutoff of 0.99, our method achieves on average approximately 22 times the success rate of random chance. Of the 102 test cases, 10 have more than 98 times the success rate of a random guess. Interestingly, in three cases, the prediction success rate is 99 times that of a random guess at a score cutoff of 0.99. This indicates that, in most situations, our extremely large-scale VS can reduce the dataset 20- to 100-fold before the next step of virtual screening based on docking or MD simulation. Furthermore, we verified the computational method experimentally by finding several active inhibitors of Trypsin I Protease. We also show a proof-of-concept application in de novo drug screening. The results indicate the substantial potential of this method as the first step of a real drug development workflow. Moreover, DFCNN takes only about 0.0000225 s on average for one protein-compound prediction with 80 Intel CPU cores (2.00 GHz) and 60 GB RAM, which is at least tens of thousands of times faster than AutoDock Vina or Schrödinger high-throughput virtual screening. Additionally, an online webserver based on DFCNN for large-scale screening is available for the convenience of users.
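The two quantities the abstract relies on, the success rate relative to random chance at a score cutoff and the per-prediction time, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names `enrichment_over_random` and `screening_seconds` and the toy scores and labels are hypothetical, chosen only to show the arithmetic. Only the 0.99 cutoff and the 0.0000225 s per-prediction figure come from the abstract.

```python
import math


def enrichment_over_random(scores, labels, cutoff=0.99):
    """Hit rate among compounds scoring >= cutoff, divided by the
    hit rate of the whole library (i.e. of a random pick).

    scores: predicted scores in [0, 1]; labels: 1 = active, 0 = inactive.
    Returns NaN when nothing passes the cutoff or the library has no actives.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    selected = [lab for s, lab in zip(scores, labels) if s >= cutoff]
    total_actives = sum(labels)
    if not selected or total_actives == 0:
        return math.nan
    hit_rate_selected = sum(selected) / len(selected)
    hit_rate_random = total_actives / len(labels)
    return hit_rate_selected / hit_rate_random


def screening_seconds(n_pairs, seconds_per_pair=0.0000225):
    """Estimated wall time for n_pairs predictions, using the average
    per-prediction time quoted in the abstract (80 CPU cores, 60 GB RAM)."""
    return n_pairs * seconds_per_pair


# Hypothetical toy library: 1000 compounds, 10 actives (1% base rate).
# Four compounds pass the 0.99 cutoff, and two of them are active.
toy_scores = [0.995, 0.999, 0.992, 0.991] + [0.5] * 996
toy_labels = [1, 1, 0, 0] + [1] * 8 + [0] * 988

print(enrichment_over_random(toy_scores, toy_labels))
print(screening_seconds(1_000_000))
```

In this toy case half of the selected compounds are active against a 1% base rate, so the ratio is about 50. At the quoted average speed, one million protein-compound pairs would take roughly 22.5 s of prediction time on comparable hardware.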
Pages: 15