Validation of Deep Learning-Based DFCNN in Extremely Large-Scale Virtual Screening and Application in Trypsin I Protease Inhibitor Discovery

Cited by: 2
Authors
Zhang, Haiping [1]
Lin, Xiao [2]
Wei, Yanjie [1]
Zhang, Huiling [1]
Liao, Linbu [3]
Wu, Hao [1]
Pan, Yi [1]
Wu, Xuli [2]
Affiliations
[1] Chinese Acad Sci, Shenzhen Inst Adv Technol, Ctr High Performance Comp, Joint Engn Res Ctr Hlth Big Data Intelligent Anal, Shenzhen, Peoples R China
[2] Shenzhen Univ, Sch Med, Shenzhen, Peoples R China
[3] Zhejiang Univ, Coll Software Technol, Hangzhou, Peoples R China
Funding
National Science Foundation (USA);
Keywords
extremely large-scale virtual screening; deep learning; DFCNN; Trypsin I Protease; de novo drug screening; DRUG DISCOVERY; DOCKING; SOFTWARE;
DOI
10.3389/fmolb.2022.872086
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Computational methods that run on affordable computational resources are highly desirable for identifying active drug leads among millions of compounds. This requires a model that is both highly efficient and reasonably accurate, a combination that most current methods cannot achieve. In real virtual screening (VS) scenarios, a useful method should select active compounds at a rate much higher than random chance. Here, we systematically evaluate the performance of our previously developed DFCNN model in large-scale virtual screening. The results show that, with a score cutoff of 0.99, our method achieves on average approximately 22 times the success rate of random selection. Of the 102 test cases, 10 have more than 98 times the success rate of a random guess. Interestingly, in three cases, the prediction success rate is 99 times that of a random guess at a score cutoff of 0.99. This indicates that in most situations, after our extremely large-scale VS, the dataset can be reduced 20- to 100-fold for the next step of virtual screening based on docking or MD simulation. Furthermore, we verified our computational method experimentally by identifying several active inhibitors of Trypsin I Protease. We also show a proof-of-concept application in de novo drug screening. The results indicate the massive potential of this method as the first step of a real drug development workflow. Moreover, DFCNN takes only about 0.0000225 s per protein-compound prediction on average with 80 Intel CPU cores (2.00 GHz) and 60 GB RAM, which is at least tens of thousands of times faster than AutoDock Vina or Schrödinger high-throughput virtual screening. Additionally, an online webserver based on DFCNN for large-scale screening is available for the convenience of users.
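The "times the success rate of random chance" figure in the abstract can be read as an enrichment factor: the fraction of true actives among compounds scoring at or above the cutoff, divided by the fraction of actives in the whole library. The sketch below illustrates that reading on toy data. The function names `enrichment_factor` and `screening_seconds` are hypothetical, the exact metric definition used by the authors is an assumption, and the timing estimate assumes the quoted 0.0000225 s average scales linearly on the stated hardware.

```python
def enrichment_factor(scores, labels, cutoff=0.99):
    """Hit rate among compounds with score >= cutoff, divided by the
    hit rate of the full library (i.e. of a random pick).

    scores: predicted scores in [0, 1], one per compound
    labels: 1 for a known active, 0 for an inactive/decoy
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("empty library")

    # Labels of the compounds that pass the score cutoff.
    selected = [lab for s, lab in zip(scores, labels) if s >= cutoff]
    total_actives = sum(labels)
    if not selected or total_actives == 0:
        return 0.0  # nothing selected, or nothing to find

    hit_rate_selected = sum(selected) / len(selected)
    hit_rate_random = total_actives / len(labels)
    return hit_rate_selected / hit_rate_random


def screening_seconds(n_pairs, seconds_per_pair=0.0000225):
    """Wall-clock estimate, assuming the average per-pair time quoted
    in the abstract holds for every protein-compound pair."""
    return n_pairs * seconds_per_pair


# Toy library (illustrative only, not data from the paper):
# 1000 compounds, 10 actives; 5 compounds pass the cutoff, 4 of them active.
toy_scores = [0.995] * 5 + [0.50] * 995
toy_labels = [1, 1, 1, 1, 0] + [1] * 6 + [0] * 989

ef = enrichment_factor(toy_scores, toy_labels, cutoff=0.99)
one_million = screening_seconds(1_000_000)
print(f"enrichment factor: {ef:.1f}")
print(f"estimated time for 1e6 pairs: {one_million:.1f} s")
```

In this toy case the selected set has a hit rate of 4/5 against a library hit rate of 10/1000, so the enrichment factor is about 80. An enrichment of this size is what allows the library to be cut by a comparable factor before slower docking or MD steps.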
Pages: 15