Predictive Modeling of Pesticides Reproductive Toxicity in Earthworms Using Interpretable Machine-Learning Techniques on Imbalanced Data

Cited by: 0
Authors
Kotli, Mihkel [1]
Piir, Geven [1]
Maran, Uko [1]
Affiliations
[1] Univ Tartu, Inst Chem, EE-50411 Tartu, Estonia
Source
ACS OMEGA | 2025 / Vol. 10 / Issue 05
Funding
EU Horizon 2020;
Keywords
QSAR MODELS; CHEMICALS; SORPTION;
DOI
10.1021/acsomega.4c09719
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
The earthworm is a key indicator species in soil ecosystems. The reproductive toxicity of chemical compounds to earthworms is therefore an important property to determine, and computational models are needed to describe and predict it. The aim of this work was to develop an advanced Quantitative Structure-Activity Relationship (QSAR) modeling approach for this complex property using imbalanced data. The approach integrated gradient-boosted decision trees as classifiers with a genetic algorithm for feature selection and Bayesian optimization for hyperparameter tuning. An additional goal was to use SHAP values to analyze and interpret the structural features, encoded by the molecular descriptors, that contribute to pesticide toxicity and nontoxicity; the most notable of these are solvation entropy and the number of hydrolyzable bonds. The final model was constructed as a stacked ensemble that combined the strengths of the individual models. Evaluation of this model on an external test set of 147 compounds demonstrated a well-defined applicability domain and sufficient predictive capability, with a Balanced Accuracy of 77%. The model representation follows FAIR principles and is available on QsarDB.org.
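The abstract reports Balanced Accuracy rather than plain accuracy, which is the usual choice for imbalanced data. The following minimal sketch shows why, assuming a binary toxic/nontoxic labeling and the standard definition (mean of sensitivity and specificity). The function name and the toy labels are hypothetical and are not taken from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Assumption: 1 = toxic, 0 = nontoxic (illustrative encoding only).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # recall on the toxic class
    specificity = tn / (tn + fp)  # recall on the nontoxic class
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced toy set: 2 toxic and 8 nontoxic compounds.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# A trivial classifier that always predicts "nontoxic".
y_pred = [0] * len(y_true)

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
print(plain_accuracy, balanced)
```

On this toy set the trivial classifier looks acceptable under plain accuracy but scores no better than chance under Balanced Accuracy, because it never identifies a toxic compound.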
Pages: 4732-4744
Page count: 13