Improved Machine Learning Models by Data Processing for Predicting Life-Cycle Environmental Impacts of Chemicals

Cited by: 18
Authors
You, Shijie [1]
Sun, Ye [1]
Wang, Xiuheng [1]
Ren, Nanqi [1]
Liu, Yanbiao [2]
Affiliations
[1] Harbin Inst Technol, Sch Environm, State Key Lab Urban Water Resource & Environm, Harbin 150090, Peoples R China
[2] Donghua Univ, Coll Environm Sci & Engn, Text Pollut Controlling Engn Ctr Minist Ecol & Env, Shanghai 201620, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
life cycle assessment (LCA); machine learning; data processing; feature selection; weighted Euclidean distance; neural network
DOI
10.1021/acs.est.2c04945
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Machine learning (ML) offers an efficient means of rapidly predicting the life-cycle environmental impacts of chemicals, but challenges remain due to the low prediction accuracy and poor interpretability of the models. To address these issues, we focused on data processing. We used a mutual information-permutation importance (MI-PI) feature selection method to filter out irrelevant molecular descriptors from the input data, which improved model interpretability by preserving the physicochemical meanings of the original molecular descriptors without generating new variables. We also applied a weighted Euclidean distance method to mine the data most relevant to the predicted targets by quantifying the contribution of each feature, thereby improving prediction accuracy. On the basis of this data processing, we developed artificial neural network (ANN) models for predicting the life-cycle environmental impacts of chemicals, with R² values of 0.81, 0.81, 0.84, 0.75, 0.73, and 0.86 for global warming, human health, metal depletion, freshwater ecotoxicity, particulate matter formation, and terrestrial acidification, respectively. The ML models were interpreted using the Shapley additive explanation (SHAP) method, which quantifies the contribution of each input molecular descriptor to each environmental impact category. This work suggests that combining feature selection by MI-PI with source data selection based on weighted Euclidean distance has promising potential to improve the accuracy and interpretability of models for predicting the life-cycle environmental impacts of chemicals.
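The abstract does not give the exact distance formula or say how the feature weights are derived, so the following is only a minimal sketch of weighted-Euclidean source data selection under stated assumptions: the descriptors are already scaled, each weight is a non-negative importance score for one descriptor, and the distance is taken as sqrt(sum_j w_j * (x_j - q_j)^2). The function names, the toy data, and the weights are hypothetical and are not taken from the paper.

```python
import math


def weighted_euclidean(x, q, weights):
    """Weighted Euclidean distance between two descriptor vectors.

    Assumed form: sqrt(sum_j w_j * (x_j - q_j)^2), with w_j >= 0.
    """
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, q)))


def select_nearest(train_X, query, weights, k):
    """Return the indices of the k training rows closest to the query."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: weighted_euclidean(train_X[i], query, weights),
    )
    return order[:k]


# Hypothetical toy data: 4 chemicals described by 3 scaled descriptors.
train_X = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
query = [0.1, 0.1, 0.1]

# Hypothetical importance scores (for example, normalized feature importances).
weights = [0.7, 0.2, 0.1]

nearest = select_nearest(train_X, query, weights, k=2)
print(nearest)
```

With equal weights, rows 1 to 3 would be equally distant from the query. Under the assumed weights, a mismatch in the first (most important) descriptor is penalized most, so the row that differs only in the least important descriptor ranks as the closer neighbor.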
Pages: 3434-3444
Number of pages: 11