Cross-column density functional theory-based quantitative structure-retention relationship model development powered by machine learning

Cited by: 3
Authors
Mazraedoost, Sargol [1]
Zuvela, Petar [1]
Ulenberg, Szymon [2]
Baczek, Tomasz [2]
Liu, J. Jay [1, 3]
Affiliations
[1] Pukyong Natl Univ, Dept Chem Engn, Intelligent Syst Lab, Busan 48513, South Korea
[2] Med Univ Gdansk, Dept Pharmaceut Chem, Gen J Hallera 107, PL-80416 Gdansk, Poland
[3] Pukyong Natl Univ, Inst Cleaner Prod Technol, 45 Yongso Ro, Busan 48513, South Korea
Funding
National Research Foundation, Singapore
Keywords
Reversed-phase high-performance liquid chromatography (RP-HPLC); Retention time prediction; Quantitative structure-retention relationship (QSRR); Machine learning (ML); Density functional theory (DFT); Cheminformatics; LEAST-SQUARES REGRESSION; COMPOUND CLASSIFICATION; STATIONARY PHASES; TIME PREDICTION; USEFUL TOOL; HPLC; ENSEMBLE; HARDNESS;
DOI
10.1007/s00216-024-05243-7
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Quantitative structure-retention relationship (QSRR) modeling has emerged as an efficient alternative to predict analyte retention times using molecular descriptors. However, most reported QSRR models are column-specific, requiring separate models for each high-performance liquid chromatography (HPLC) system. This study evaluates the potential of machine learning (ML) algorithms and quantum mechanical (QM) descriptors to develop QSRR models that can predict retention times across three different reversed-phase HPLC columns under varying conditions. Four machine learning methods, namely partial least squares (PLS) regression, ridge regression (RR), random forest (RF), and gradient boosting (GB), were compared on a dataset of 360 retention times for 15 aromatic analytes. Molecular descriptors were calculated using density functional theory (DFT). Column characteristics like particle size and pore size and experimental conditions like temperature and gradient time were additionally used as descriptors. Results showed that the GB-QSRR model demonstrated the best predictive performance, with Q² of 0.989 and root mean square error of prediction (RMSEP) of 0.749 min on the test set. Feature analysis revealed that solvation energy (SE), HOMO-LUMO energy gap (ΔE HOMO-LUMO), total dipole moment (Mtot), and global hardness (η) are among the most influential predictors for retention time prediction, indicating the significance of electrostatic interactions and hydrophobicity. Our findings underscore the efficiency of ensemble methods, GB and RF models employing non-linear learners, in capturing local variations in retention times across diverse experimental setups. This study emphasizes the potential of cross-column QSRR modeling and highlights the utility of ML models in optimizing chromatographic analysis.
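The two test-set metrics quoted in the abstract, Q² and RMSEP, can be illustrated with a minimal sketch. The abstract does not state which Q² convention the authors used, so the version below (one minus the ratio of the prediction error sum of squares to the sum of squares about the training-set mean) is an assumption. The function names and the retention times are hypothetical and chosen only for illustration; they are not data from the paper.

```python
import math


def rmsep(y_obs, y_pred):
    """Root mean square error of prediction over a test set."""
    squared_errors = [(o - p) ** 2 for o, p in zip(y_obs, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_obs))


def q2(y_obs, y_pred, y_train_mean):
    """External Q^2 = 1 - PRESS / SS.

    Assumed convention: SS is taken about the training-set mean.
    Other definitions exist and give slightly different values.
    """
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss = sum((o - y_train_mean) ** 2 for o in y_obs)
    return 1.0 - press / ss


# Made-up retention times in minutes (illustration only, not from the paper).
observed = [4.0, 8.0, 12.0, 16.0]
predicted = [4.5, 7.5, 12.5, 15.5]
train_mean = 10.0

print(round(rmsep(observed, predicted), 3))            # 0.5
print(round(q2(observed, predicted, train_mean), 4))   # 0.9875
```

RMSEP is in the same units as the retention time (minutes here), which is why the abstract reports it alongside the dimensionless Q².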
Pages: 2951-2968
Number of pages: 18
Source: Analytical and Bioanalytical Chemistry, 2024, 416