In silico prediction of toxicity of phenols to Tetrahymena pyriformis by using genetic algorithm and decision tree-based modeling approach

Cited by: 23
Authors
Abbasitabar, Fatemeh [1]
Zare-Shahabadi, Vahid [2]
Affiliations
[1] Islamic Azad Univ, Dept Chem, Marvdasht Branch, Marvdasht, Iran
[2] Islamic Azad Univ, Dept Chem, Mahshahr Branch, Mahshahr, Iran
Keywords
Toxicity; Phenol; Decision tree; Genetic algorithm; Tetrahymena pyriformis; MINNOW PIMEPHALES-PROMELAS; STRUCTURE-PROPERTY RELATIONSHIP; MULTIPLE LINEAR REGRESSIONS; ACUTE AQUATIC TOXICITY; QUANTITATIVE STRUCTURE; FATHEAD MINNOW; QSAR MODELS; MOLECULAR-STRUCTURE; ORGANIC-COMPOUNDS; SAR MODELS;
DOI
10.1016/j.chemosphere.2016.12.095
Chinese Library Classification
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Risk assessment of chemicals is an important issue in environmental protection; however, experimental data are lacking for a large number of end-points. Experimental determination of chemical toxicity is costly and time-consuming. In silico tools such as quantitative structure-toxicity relationship (QSTR) models, which are built from computed molecular descriptors, can predict missing data for toxic end-points for existing or even not-yet-synthesized chemicals. Phenol derivatives are known aquatic pollutants. Against this background, we aimed to develop an accurate and reliable QSTR model for predicting the toxicity of 206 phenols to Tetrahymena pyriformis. A multiple linear regression (MLR)-based QSTR was obtained using a powerful descriptor selection tool named the Memorized_ACO algorithm. This model gave R²(training) = 0.72 and R²(test) = 0.68. To develop a higher-quality QSTR model, the classification and regression tree (CART) method was employed. Two approaches were considered: (1) the phenols were classified into different modes of action using CART, and (2) the phenols in the training set were partitioned into several subsets by a tree in such a manner that a high-quality MLR could be developed in each subset. For the first approach, the statistical parameters of the resulting QSTR model improved to R²(training) = 0.83 and R²(test) = 0.75. A genetic algorithm was employed in the second approach to obtain an optimal tree, and the final QSTR model provided excellent prediction accuracy for the training and test sets (R²(training) = 0.91 and R²(test) = 0.93). The mean absolute error for the test set was 0.1615. © 2016 Elsevier Ltd. All rights reserved.
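The second approach in the abstract (partition the training set with a tree, then fit a separate linear model in each subset) can be illustrated with a minimal sketch. This is not the authors' implementation: the data below are synthetic, a single hypothetical descriptor `x` stands in for the real molecular descriptors, the "tree" is reduced to one threshold, and an exhaustive threshold search replaces the genetic algorithm used in the paper. All function names are assumptions made for illustration.

```python
from typing import List, Tuple

Line = Tuple[float, float]  # (intercept, slope)


def fit_line(xs: List[float], ys: List[float]) -> Line:
    """Ordinary least squares for y = a + b*x with one descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my - slope * mx, slope


def r_squared(y_true: List[float], y_pred: List[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true: List[float], y_pred: List[float]) -> float:
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_split_model(xs: List[float], ys: List[float],
                    min_leaf: int = 3) -> Tuple[float, Line, Line]:
    """Pick the threshold whose two local lines give the lowest total squared error.

    Exhaustive search over midpoints; the paper searches trees with a
    genetic algorithm instead, which is not reproduced here.
    """
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [(x, y) for x, y in zip(xs, ys) if x <= t]
        right = [(x, y) for x, y in zip(xs, ys) if x > t]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        lines = [fit_line([x for x, _ in part], [y for _, y in part])
                 for part in (left, right)]
        sse = sum((y - (a + b * x)) ** 2
                  for part, (a, b) in zip((left, right), lines)
                  for x, y in part)
        if best is None or sse < best[0]:
            best = (sse, t, lines[0], lines[1])
    if best is None:
        raise ValueError("not enough points for a split")
    return best[1], best[2], best[3]


def predict(model: Tuple[float, Line, Line], x: float) -> float:
    threshold, left, right = model
    a, b = left if x <= threshold else right
    return a + b * x


# Synthetic data with two regimes (purely illustrative, not toxicity data).
xs = [float(x) for x in range(10)]
ys = [1 + 2 * x if x <= 4 else 30 - 3 * x for x in xs]

a0, b0 = fit_line(xs, ys)                      # single global line
r2_global = r_squared(ys, [a0 + b0 * x for x in xs])

model = fit_split_model(xs, ys)                # one split, two local lines
split_pred = [predict(model, x) for x in xs]
r2_split = r_squared(ys, split_pred)
mae_split = mean_absolute_error(ys, split_pred)

print("global R2:", round(r2_global, 3))
print("split  R2:", round(r2_split, 3), "MAE:", round(mae_split, 4))
```

On data with distinct regimes, the partitioned model fits far better than a single global line, which mirrors the qualitative improvement the abstract reports when moving from one MLR to tree-partitioned MLRs. The reported values (0.72, 0.91, 0.1615, and so on) come from the paper's own data and cannot be reproduced by this toy example.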
Pages: 249-259
Number of pages: 11