Exploring Dimensionality Reduction Techniques for Deep Learning Driven QSAR Models of Mutagenicity

Cited by: 1
Authors
Kalian, Alexander D. [1]
Benfenati, Emilio [2]
Osborne, Olivia J. [3]
Gott, David [3]
Potter, Claire [3]
Dorne, Jean-Lou C. M. [4]
Guo, Miao [5]
Hogstrand, Christer [6]
Affiliations
[1] Kings Coll London, Dept Nutr Sci, Franklin Wilkins Bldg, 150 Stamford St, London SE1 9NH, England
[2] Ist Ric Farmacolog Mario Negri IRCCS, Via Mario Negri 2, I-20156 Milan, Italy
[3] Food Stand Agcy, 70 Petty France, London SW1H 9EX, England
[4] European Food Safety Author EFSA, Via Carlo Magno 1A, I-43126 Parma, Italy
[5] Kings Coll London, Dept Engn, Strand Campus, London WC2R 2LS, England
[6] Kings Coll London, Dept Analyt Environm & Forens Sci, Franklin Wilkins Bldg, 150 Stamford St, London SE1 9NH, England
Funding
UK Biotechnology and Biological Sciences Research Council (BBSRC);
Keywords
QSAR; dimensionality reduction; deep learning; autoencoder; principal component analysis; locally linear embedding; grid search; hyperparameter optimisation; mutagenicity; cheminformatics;
DOI
10.3390/toxics11070572
Chinese Library Classification (CLC)
X [Environmental Science, Safety Science];
Subject Classification Code
08; 0830;
Abstract
Dimensionality reduction techniques are crucial for enabling deep learning driven quantitative structure-activity relationship (QSAR) models to navigate higher-dimensional toxicological spaces; however, the choice of specific techniques is often arbitrary and poorly explored. Six dimensionality reduction techniques (both linear and non-linear) were therefore applied to a higher-dimensionality mutagenicity dataset and compared in their ability to power a simple deep learning driven QSAR model, following grid searches for optimal hyperparameter values. It was found that comparatively simpler linear techniques, such as principal component analysis (PCA), were sufficient to enable optimal QSAR model performance, indicating that the original dataset was at least approximately linearly separable (in accordance with Cover's theorem). However, certain non-linear techniques, such as kernel PCA and autoencoders, performed at closely comparable levels, while (especially in the case of autoencoders) being more widely applicable to potentially non-linearly separable datasets. Analysis of the chemical space, in terms of XLogP and molecular weight, revealed that the vast majority of the testing data fell within the defined applicability domain, and that certain regions were measurably more problematic and degraded performance. Nevertheless, certain dimensionality reduction techniques were indicated to facilitate uniquely beneficial navigations of the chemical space.
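As a rough illustration of the workflow described in the abstract, the sketch below compares three of the named reducers (PCA, kernel PCA and locally linear embedding) as preprocessing for a simple neural-network classifier, with a small hyperparameter grid search. This is a minimal sketch under assumed inputs, not the authors' code: the synthetic `make_classification` data merely stands in for high-dimensional mutagenicity descriptors, the grids are illustrative only, and the autoencoder variant is omitted since it would require a separate deep learning framework.

```python
# Hypothetical sketch: comparing dimensionality reduction techniques as
# preprocessing for a simple neural-network QSAR classifier, with a small
# grid search over reducer and classifier hyperparameters.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA, KernelPCA
from sklearn.manifold import LocallyLinearEmbedding
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for molecular descriptors (X) and binary
# mutagenicity labels (y); not the dataset used in the paper.
X, y = make_classification(n_samples=500, n_features=1024, n_informative=50,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y,
                                                    random_state=0)

reducers = {
    "pca": PCA(),
    "kernel_pca": KernelPCA(kernel="rbf"),
    "lle": LocallyLinearEmbedding(n_neighbors=40),
}

for name, reducer in reducers.items():
    pipeline = Pipeline([
        ("scale", StandardScaler()),   # standardise descriptors
        ("reduce", reducer),           # dimensionality reduction step
        ("clf", MLPClassifier(max_iter=1000, random_state=0)),
    ])
    # Illustrative grid only; the paper's searches were more extensive.
    param_grid = {
        "reduce__n_components": [8, 16, 32],
        "clf__hidden_layer_sizes": [(64,), (128, 64)],
    }
    search = GridSearchCV(pipeline, param_grid, cv=3,
                          scoring="balanced_accuracy")
    search.fit(X_train, y_train)
    print(f"{name}: best CV score = {search.best_score_:.3f}, "
          f"test score = {search.score(X_test, y_test):.3f}")
```

Wrapping each reducer in a shared pipeline keeps the comparison fair, since scaling, reduction and classification are re-fitted within every cross-validation fold for every hyperparameter combination.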
Pages: 24