Cross-column density functional theory-based quantitative structure-retention relationship model development powered by machine learning

Cited by: 3
Authors
Mazraedoost, Sargol [1]
Zuvela, Petar [1]
Ulenberg, Szymon [2]
Baczek, Tomasz [2]
Liu, J. Jay [1, 3]
Affiliations
[1] Pukyong Natl Univ, Dept Chem Engn, Intelligent Syst Lab, Busan 48513, South Korea
[2] Med Univ Gdansk, Dept Pharmaceut Chem, Gen J Hallera 107, PL-80416 Gdansk, Poland
[3] Pukyong Natl Univ, Inst Cleaner Prod Technol, 45 Yongso Ro, Busan 48513, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Reversed-phase high-performance liquid chromatography (RP-HPLC); Retention time prediction; Quantitative structure-retention relationship (QSRR); Machine learning (ML); Density functional theory (DFT); Cheminformatics; LEAST-SQUARES REGRESSION; COMPOUND CLASSIFICATION; STATIONARY PHASES; TIME PREDICTION; USEFUL TOOL; HPLC; ENSEMBLE; HARDNESS;
DOI
10.1007/s00216-024-05243-7
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Quantitative structure-retention relationship (QSRR) modeling has emerged as an efficient alternative for predicting analyte retention times from molecular descriptors. However, most reported QSRR models are column-specific, so a separate model is required for each high-performance liquid chromatography (HPLC) system. This study evaluates the potential of machine learning (ML) algorithms and quantum mechanical (QM) descriptors to develop QSRR models that can predict retention times across three different reversed-phase HPLC columns under varying conditions. Four machine learning methods, namely partial least squares (PLS) regression, ridge regression (RR), random forest (RF), and gradient boosting (GB), were compared on a dataset of 360 retention times for 15 aromatic analytes. Molecular descriptors were calculated using density functional theory (DFT). Column characteristics such as particle size and pore size, and experimental conditions such as temperature and gradient time, were additionally used as descriptors. The GB-QSRR model demonstrated the best predictive performance, with a Q² of 0.989 and a root mean square error of prediction (RMSEP) of 0.749 min on the test set. Feature analysis revealed that solvation energy (SE), HOMO-LUMO energy gap (ΔE HOMO-LUMO), total dipole moment (Mtot), and global hardness (η) are among the most influential predictors of retention time, indicating the significance of electrostatic interactions and hydrophobicity. Our findings underscore the efficiency of ensemble methods (GB and RF models, which employ non-linear learners) in capturing local variations in retention times across diverse experimental setups. This study emphasizes the potential of cross-column QSRR modeling and highlights the utility of ML models in optimizing chromatographic analysis.
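The two test-set metrics quoted in the abstract, Q² and RMSEP, can be illustrated with a minimal sketch. The abstract does not state which Q² convention the authors used, so the version below assumes the common external form Q² = 1 - PRESS/SS, with SS taken around the training-set mean. The function names and the retention times are hypothetical and purely illustrative; they are not data from the study.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a held-out test set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def q2(y_true, y_pred, y_train_mean):
    """External Q2 = 1 - PRESS / SS.

    Assumption: SS is computed around the training-set mean; other
    conventions use the test-set mean instead.
    """
    press = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss = sum((t - y_train_mean) ** 2 for t in y_true)
    return 1.0 - press / ss


# Made-up retention times in minutes (illustrative only, not from the study).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
train_mean = 5.0  # hypothetical mean retention time of the training set

rmsep_value = rmsep(observed, predicted)
q2_value = q2(observed, predicted, train_mean)
print(f"RMSEP = {rmsep_value:.3f} min, Q2 = {q2_value:.3f}")
```

RMSEP is expressed in the same units as the response (minutes here), whereas Q² is dimensionless and approaches 1 as the prediction errors shrink relative to the spread of the observed values.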
Pages: 2951-2968
Number of pages: 18