Traditional Machine and Deep Learning for Predicting Toxicity Endpoints

Cited by: 3
Author
Norinder, Ulf [1]
Affiliation
[1] Stockholm Univ, Dept Comp & Syst Sci, S-16407 Kista, Sweden
Source
MOLECULES | 2023 / Vol. 28 / Issue 01
Keywords
CATMoS dataset; CDDD; BERT; conformal prediction; random forest; RDKit; LANGUAGE
DOI
10.3390/molecules28010217
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Molecular structure property modeling is an increasingly important tool for predicting compounds with desired properties due to the expensive and resource-intensive nature and the problem of toxicity-related attrition in late phases during drug discovery and development. Lately, the interest for applying deep learning techniques has increased considerably. This investigation compares the traditional physico-chemical descriptor and machine learning-based approaches through autoencoder generated descriptors to two different descriptor-free, Simplified Molecular Input Line Entry System (SMILES) based, deep learning architectures of Bidirectional Encoder Representations from Transformers (BERT) type using the Mondrian aggregated conformal prediction method as overarching framework. The results show for the binary CATMoS non-toxic and very-toxic datasets that for the former, almost equally balanced, dataset all methods perform equally well while for the latter dataset, with an 11-fold difference between the two classes, the MolBERT model based on a large pre-trained network performs somewhat better compared to the rest with high efficiency for both classes (0.93-0.94) as well as high values for sensitivity, specificity and balanced accuracy (0.86-0.87). The descriptor-free, SMILES-based, deep learning BERT architectures seem capable of producing well-balanced predictive models with defined applicability domains. This work also demonstrates that the class imbalance problem is gracefully handled through the use of Mondrian conformal prediction without the use of over- and/or under-sampling, weighting of classes or cost-sensitive methods.
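The Mondrian (class-conditional) conformal prediction idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the toy scores, and the choice of nonconformity score (for example, 1 minus the predicted probability of a label) are assumptions made for illustration, and the aggregation step over several calibration splits is omitted. The key point shown is that each class is calibrated only against calibration examples of that same class, which is why a minority class keeps its own validity guarantee without over- or under-sampling.

```python
from collections import defaultdict


def mondrian_p_values(cal_scores, cal_labels, test_scores):
    """Class-conditional conformal p-values for one test compound.

    cal_scores[i] is the nonconformity score of calibration example i
    computed for its true label cal_labels[i].
    test_scores maps each candidate label to the nonconformity score of
    the test compound under that label.
    (Hypothetical helper; names are illustrative only.)
    """
    # Group calibration scores by true class: this is the Mondrian step.
    by_class = defaultdict(list)
    for score, label in zip(cal_scores, cal_labels):
        by_class[label].append(score)

    p_values = {}
    for label, alpha in test_scores.items():
        scores = by_class[label]
        # Count calibration examples of this class that are at least as
        # nonconforming as the test compound, plus one for the test itself.
        at_least = sum(1 for s in scores if s >= alpha)
        p_values[label] = (at_least + 1) / (len(scores) + 1)
    return p_values


def prediction_set(p_values, significance):
    """Keep every label whose p-value exceeds the significance level."""
    return {label for label, p in p_values.items() if p > significance}


# Toy, imbalanced calibration set: four examples of class 0, two of class 1.
cal_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.9]
cal_labels = [0, 0, 0, 0, 1, 1]
test_scores = {0: 0.25, 1: 0.6}

p = mondrian_p_values(cal_scores, cal_labels, test_scores)
both = prediction_set(p, significance=0.2)
single = prediction_set(p, significance=0.65)
```

A prediction set containing exactly one label is a single-label prediction; efficiency is commonly reported as the fraction of test compounds that receive such a set at the chosen significance level, which is assumed here to be the sense used in the abstract.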
Pages: 10
Related papers
50 in total
  • [1] Predicting Renal Toxicity of Compounds with Deep Learning and Machine Learning Methods
    Bitopan Mazumdar
    Pankaj Kumar Deva Sarma
    Hridoy Jyoti Mahanta
    [J]. SN Computer Science, 4 (6)
  • [2] Predicting groundwater level using traditional and deep machine learning algorithms
    Feng, Fan
    Ghorbani, Hamzeh
    Radwan, Ahmed E.
    [J]. FRONTIERS IN ENVIRONMENTAL SCIENCE, 2024, 12
  • [3] hERG-toxicity prediction using traditional machine learning and advanced deep learning techniques
    Ylipaa, Erik
    Chavan, Swapnil
    Bankestad, Maria
    Broberg, Johan
    Glinghammar, Bjorn
    Norinder, Ulf
    Cotgreave, Ian
    [J]. CURRENT RESEARCH IN TOXICOLOGY, 2023, 5
  • [4] Predicting toxicity by quantum machine learning
    Suzuki, Teppei
    Katouda, Michio
    [J]. JOURNAL OF PHYSICS COMMUNICATIONS, 2020, 4 (12)
  • [5] Is Predicting Software Security Bugs using Deep Learning Better than the Traditional Machine Learning Algorithms?
    Clemente, Caesar Jude
    Jaafar, Fehmi
    Malik, Yasir
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON SOFTWARE QUALITY, RELIABILITY AND SECURITY (QRS 2018), 2018 : 95 - 102
  • [6] Advanced Mass-Spectra-Based Machine Learning for Predicting the Toxicity of Traditional Chinese Medicines
    Jia, Chen
    Li, Xiaofang
    Hu, Song
    Liu, Guohong
    Fang, Jiansong
    Zhou, Xiaoxia
    Yan, Xiliang
    Yan, Bing
    [J]. Analytical Chemistry, 2025, 97 (01) : 783 - 792
  • [7] Predicting Toxicity Properties through Machine Learning
    Adriana Borrero, Luz
    Sanchez Guette, Lilibeth
    Lopez, Enrique
    Bonerge Pineda, Omar
    Buelvas Castro, Edgardo
    [J]. 11TH INTERNATIONAL CONFERENCE ON AMBIENT SYSTEMS, NETWORKS AND TECHNOLOGIES (ANT) / THE 3RD INTERNATIONAL CONFERENCE ON EMERGING DATA AND INDUSTRY 4.0 (EDI40) / AFFILIATED WORKSHOPS, 2020, 170 : 1011 - 1016
  • [8] Predicting E-commerce customer satisfaction: Traditional machine learning vs. deep learning approaches
    Zaghloul, Maha
    Barakat, Sherif
    Rezk, Amira
    [J]. JOURNAL OF RETAILING AND CONSUMER SERVICES, 2024, 79
  • [9] Predicting the maximum lateral load of reinforced concrete columns with traditional machine learning, deep learning, and structural analysis software
    Canbay, Pelin
    Avgin, Sila
    Kose, Mehmet M.
    [J]. COMPUTERS AND CONCRETE, 2024, 33 (03) : 285 - 299
  • [10] Gaining insight into toxicity predicting machine learning algorithms
    Allen, T. E. H.
    Gelzinyte, E.
    Wedlake, A. J.
    Goodman, J. M.
    Gutsell, S.
    Russell, P. J.
    [J]. TOXICOLOGY LETTERS, 2019, 314 : S280 - S280