Machine Learning-Based Hazard-Driven Prioritization of Features in Nontarget Screening of Environmental High-Resolution Mass Spectrometry Data

Cited by: 18
Authors
Arturi, Katarzyna [1 ]
Hollender, Juliane [1 ,2 ]
Affiliations
[1] Swiss Fed Inst Aquat Sci & Technol Eawag, Dept Environm Chem, CH-8600 Dubendorf, Switzerland
[2] Eidgenoss TH Zurich ETH Zurich, Inst Biogeochem & Pollut Dynam, CH-8092 Zurich, Switzerland
Keywords
ToxCast; Tox21; toxicity prediction; HRMS; MS; supervised classification; extreme gradient boosting; SIRIUS; IN-VITRO; PREDICTION; CHEMISTRY; TOXICITY; LIBRARY; MODELS; ASSAY
DOI
10.1021/acs.est.3c00304
Chinese Library Classification number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
MLinvitroTox maps toxicologically relevant pollution in aquatic environments by predicting the toxicity of unidentified NTS HRMS/MS features from fragmentation spectra via machine learning.

Nontarget high-resolution mass spectrometry screening (NTS HRMS/MS) can detect thousands of organic substances in environmental samples. However, new strategies are needed to focus time-intensive identification efforts on features with the highest potential to cause adverse effects instead of the most abundant ones. To address this challenge, we developed MLinvitroTox, a machine learning framework that uses molecular fingerprints derived from fragmentation spectra (MS2) for a rapid classification of thousands of unidentified HRMS/MS features as toxic/nontoxic based on nearly 400 target-specific and over 100 cytotoxic endpoints from ToxCast/Tox21. Model development results demonstrated that using customized molecular fingerprints and models, over a quarter of toxic endpoints and the majority of the associated mechanistic targets could be accurately predicted with sensitivities exceeding 0.95. Notably, SIRIUS molecular fingerprints and XGBoost (Extreme Gradient Boosting) models with SMOTE (Synthetic Minority Oversampling Technique) for handling data imbalance were a universally successful and robust modeling configuration. Validation of MLinvitroTox on MassBank spectra showed that toxicity could be predicted from molecular fingerprints derived from MS2 with an average balanced accuracy of 0.75. By applying MLinvitroTox to environmental HRMS/MS data, we confirmed the experimental results obtained with target analysis and narrowed the analytical focus from tens of thousands of detected signals to 783 features linked to potential toxicity, including 109 spectral matches and 30 compounds with confirmed toxic activity.
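The abstract reports performance as sensitivity and balanced accuracy, which are the usual choices when the toxic class is much rarer than the nontoxic one. The following is a minimal sketch of how these two metrics are computed for a binary toxic/nontoxic classifier. The labels and predictions are hypothetical illustration data, not results from the paper, and the function names are assumptions made for this example rather than part of MLinvitroTox.

```python
# Sketch only: hypothetical labels, 1 = toxic (minority class), 0 = nontoxic.

def sensitivity(y_true, y_pred):
    """True positive rate: fraction of toxic items predicted as toxic."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)

def specificity(y_true, y_pred):
    """True negative rate: fraction of nontoxic items predicted as nontoxic."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp)

def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity, so both classes weigh equally."""
    return (sensitivity(y_true, y_pred) + specificity(y_true, y_pred)) / 2

def accuracy(y_true, y_pred):
    """Plain accuracy, shown for comparison on imbalanced data."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

# Hypothetical imbalanced example: 2 toxic and 8 nontoxic items.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]

print("sensitivity:      ", sensitivity(y_true, y_pred))
print("specificity:      ", specificity(y_true, y_pred))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
print("plain accuracy:   ", accuracy(y_true, y_pred))
```

In this toy case plain accuracy is higher than balanced accuracy because the majority class dominates the count, which is why balanced accuracy is the more informative figure for imbalanced endpoints.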
Pages: 18067-18079
Number of pages: 13