Improved Machine Learning Models by Data Processing for Predicting Life-Cycle Environmental Impacts of Chemicals

Cited by: 14
Authors
You, Shijie [1]
Sun, Ye [1]
Wang, Xiuheng [1]
Ren, Nanqi [1]
Liu, Yanbiao [2]
Affiliations
[1] Harbin Inst Technol, Sch Environm, State Key Lab Urban Water Resource & Environm, Harbin 150090, Peoples R China
[2] Donghua Univ, Coll Environm Sci & Engn, Text Pollut Controlling Engn Ctr Minist Ecol & Env, Shanghai 201620, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
life cycle assessment (LCA); machine learning; data processing; feature selection; weighted Euclidean distance; FEATURE-SELECTION; NEURAL-NETWORK;
DOI
10.1021/acs.est.2c04945
Chinese Library Classification
X [Environmental Science, Safety Science];
Discipline classification code
08; 0830;
Abstract
Machine learning (ML) offers an efficient means of rapidly predicting the life-cycle environmental impacts of chemicals, but challenges remain because of the low prediction accuracy and poor interpretability of the models. To address these issues, we focused on data processing. We used a mutual information-permutation importance (MI-PI) feature selection method to filter out irrelevant molecular descriptors from the input data, which improved model interpretability by preserving the physicochemical meanings of the original molecular descriptors without generating new variables. We also applied a weighted Euclidean distance method to mine the data most relevant to the predicted targets by quantifying the contribution of each feature, thereby improving prediction accuracy. On the basis of this data processing, we developed artificial neural network (ANN) models for predicting the life-cycle environmental impacts of chemicals, with R² values of 0.81, 0.81, 0.84, 0.75, 0.73, and 0.86 for global warming, human health, metal depletion, freshwater ecotoxicity, particulate matter formation, and terrestrial acidification, respectively. The ML models were interpreted using the Shapley additive explanation method, which quantifies the contribution of each input molecular descriptor to each environmental impact category. This work suggests that combining feature selection by MI-PI with source data selection based on weighted Euclidean distance is a promising way to improve the accuracy and interpretability of models for predicting the life-cycle environmental impacts of chemicals.
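The source-data selection step can be illustrated with a minimal sketch. The abstract does not give the exact weighting scheme, so the code below assumes the weights are non-negative per-feature importance scores (for example, normalized permutation importances) and that the descriptors are already scaled. The function names `weighted_euclidean` and `select_nearest`, the descriptor vectors, and the weight values are all hypothetical, chosen only for illustration.

```python
import math
from typing import List, Sequence


def weighted_euclidean(x: Sequence[float], y: Sequence[float],
                       w: Sequence[float]) -> float:
    """Return sqrt(sum_i w_i * (x_i - y_i)^2) for non-negative weights w."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    if any(wi < 0 for wi in w):
        raise ValueError("weights must be non-negative")
    return math.sqrt(sum(wi * (xi - yi) ** 2 for xi, yi, wi in zip(x, y, w)))


def select_nearest(target: Sequence[float],
                   candidates: Sequence[Sequence[float]],
                   weights: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k candidates closest to target, nearest first."""
    ranked = sorted(
        range(len(candidates)),
        key=lambda i: weighted_euclidean(target, candidates[i], weights),
    )
    return ranked[:k]


# Hypothetical importance weights and scaled descriptor vectors (assumptions).
weights = [0.7, 0.2, 0.1]
target = [0.5, 0.5, 0.5]
candidates = [
    [0.5, 0.5, 0.9],  # differs only in the least important feature
    [0.9, 0.5, 0.5],  # differs only in the most important feature
    [0.5, 0.8, 0.5],  # differs only in the mid-importance feature
]

# Keep the two training chemicals most similar to the target chemical.
nearest = select_nearest(target, candidates, weights, k=2)
print(nearest)
```

In this toy setup the first and second candidates are equally far from the target in plain Euclidean terms, yet the weighting ranks the second one last because it differs in the feature assumed to matter most. That is the intended effect of weighting distance by feature contribution.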
Pages: 3434-3444
Number of pages: 11
Related papers
50 in total
  • [21] Sensitivity analysis of design variables in life-cycle environmental impacts of buildings
    Zhou, Yijun
    Tam, Vivian WY.
    Le, Khoa N.
    JOURNAL OF BUILDING ENGINEERING, 2023, 65
  • [22] Environmental input-output models for life-cycle analysis
    Pan, XM
    Kraines, S
    ENVIRONMENTAL & RESOURCE ECONOMICS, 2001, 20 (01): : 61 - 72
  • [23] Environmental Input-Output Models for Life-Cycle Analysis
    Xiaoming Pan
    Steven Kraines
    Environmental and Resource Economics, 2001, 20 : 61 - 72
  • [24] Environmental impacts of lithium production showing the importance of primary data of upstream process in life-cycle assessment
    Jiang, Songyan
    Zhang, Ling
    Li, Fengying
    Hua, Hui
    Liu, Xin
    Yuan, Zengwei
    Wu, Huijun
    JOURNAL OF ENVIRONMENTAL MANAGEMENT, 2020, 262 (262)
  • [25] Cost Analysis and Life-Cycle Environmental Impacts of Three Value-Added Novel Bioproducts: Processing and Production
    Emmanuel K. Yiridoe
    Qiaojie Chen
    Rodney Fry
    Derek Lynch
    Gordon Price
    Natural Resources Research, 2015, 24 : 65 - 84
  • [26] Cost Analysis and Life-Cycle Environmental Impacts of Three Value-Added Novel Bioproducts: Processing and Production
    Yiridoe, Emmanuel K.
    Chen, Qiaojie
    Fry, Rodney
    Lynch, Derek
    Price, Gordon
    NATURAL RESOURCES RESEARCH, 2015, 24 (01) : 65 - 84
  • [27] Improved Machine Learning Models for Predicting Selective Compounds
    Ning, Xia
    Walters, Michael
    Karypis, George
    JOURNAL OF CHEMICAL INFORMATION AND MODELING, 2012, 52 (01) : 38 - 50
  • [28] Towards improved environmental indicators for mining using life-cycle thinking
    van Zyl, DJ
    LIFE-CYCLE ASSESSMENT OF METALS: ISSUES AND RESEARCH DIRECTIONS, 2003, : 117 - 122
  • [29] Analysis of the life-cycle costs and environmental impacts of cooking fuels used in Ghana
    Afrane, George
    Ntiamoah, Augustine
    APPLIED ENERGY, 2012, 98 : 301 - 306
  • [30] Examining the life-cycle environmental impacts of desalination: A case study in the State of Qatar
    Mannan, Mehzabeen
    Alhaj, Mohamed
    Mabrouk, Abdel Nasser
    Al-Ghamdi, Sami G.
    DESALINATION, 2019, 452 : 238 - 246