Evaluation of different machine learning approaches for predicting high concentration episodes of ground-level ozone: A case study in Catalonia, Spain

Cited by: 6
Authors
Vicente, D. J. [1]
Salazar, F. [1, 2]
Lopez-Chacon, S. R. [1]
Soriano, C. [1]
Martin-Vide, J. [3]
Affiliations
[1] Int Ctr Numer Methods Engn CIMNE, Barcelona 08034, Spain
[2] Univ Politecn Catalunya UPC, Flumen Res Inst, Barcelona 08034, Spain
[3] Univ Barcelona, Dept Geog, IdRA Climatol Grp, Barcelona, Spain
Keywords
Ozone; Air pollution; Machine learning; High ozone episodes; Random forest; Support vector machine; Surface ozone; Spatiotemporal prediction; China; Model; Classification; Pollution
DOI
10.1016/j.apr.2023.101999
Chinese Library Classification (CLC) code
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Ground-level ozone (O3) is a pollutant with a great impact on human health and the environment. Because it is a secondary air contaminant of photochemical origin, areas with greater exposure to solar radiation, such as Spain and other Mediterranean countries, are considerably affected. As O3 pollution worsens, reliable forecasting tools are needed to help stakeholders implement more effective policies to mitigate its negative impact. Machine learning (ML)-based models have emerged in recent years for this purpose, since they can identify complex relationships between ozone levels and relevant variables. However, capturing the most extreme events with them remains difficult. In this work, different ML approaches for predicting daily maximum 8-h average ozone (O3,MDA8) were compared, investigating their ability to forecast the highest concentration levels recorded. Two variants of the Random Forest algorithm (regression and classification) were applied to a specific area of Catalonia, Spain, of special interest because of its high number of episodes in which O3 concentration limits are exceeded. The predictive models were built with a 1-day time horizon, using datasets from 2002 to 2020. The input variables were concentrations of other air pollutants and meteorological variables, both monitored the day before the target day, together with time information. Although the results showed reasonable overall performance, accuracy was low when forecasting the highest O3,MDA8 episodes. To improve the models' capacity to predict high O3,MDA8 concentration levels, a methodology was proposed to fine-tune the original predictions of the ML models according to a classification metric, G-Mean, which allows the balance between correct predictions of the different classes to be adjusted.
Using the Sensitivity and Specificity metrics, the classical approaches were compared with the original ones proposed in the present study. For all the cases analysed, the results showed a mean increase in Sensitivity of 0.28, associated with a greater number of True Positives (correct predictions of high-O3 episodes). On the other hand, the average Specificity decreased because of a greater number of False Positives, although this reduction was only 0.05. The proposed criteria showed promising results, balancing the classification metrics better and increasing the ratio of correct predictions in the higher ranges of O3.
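The abstract does not spell out how the predictions are fine-tuned, so the following is only a minimal sketch of one common way to use G-Mean, assuming it is defined as the square root of Sensitivity times Specificity and that tuning means choosing the decision threshold on a model score that maximizes it. All function names, the toy labels, and the toy scores are hypothetical illustrations and are not taken from the paper.

```python
import math


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = high-O3 episode)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    sens, spec = sensitivity_specificity(y_true, y_pred)
    return math.sqrt(sens * spec)


def best_threshold(y_true, scores):
    """Try each observed score as a threshold; keep the one with the highest G-Mean."""
    best_t, best_g = None, -1.0
    for t in sorted(set(scores)):
        y_pred = [1 if s >= t else 0 for s in scores]
        g = g_mean(y_true, y_pred)
        if g > best_g:
            best_t, best_g = t, g
    return best_t, best_g


# Hypothetical toy data: 6 normal days, 2 high-O3 days, with model scores.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.40, 0.70]

default_pred = [1 if s >= 0.5 else 0 for s in scores]
default_sens, default_spec = sensitivity_specificity(y_true, default_pred)
tuned_t, tuned_g = best_threshold(y_true, scores)
tuned_pred = [1 if s >= tuned_t else 0 for s in scores]
tuned_sens, tuned_spec = sensitivity_specificity(y_true, tuned_pred)

print(f"default 0.5: sensitivity={default_sens:.2f}, specificity={default_spec:.2f}")
print(f"tuned {tuned_t}: sensitivity={tuned_sens:.2f}, specificity={tuned_spec:.2f}")
```

On this toy data, lowering the threshold recovers the missed high-O3 day without adding false alarms. On real data a gain in Sensitivity usually costs some Specificity, which is the trade-off the abstract reports. In practice the threshold would be selected on validation data rather than on the test set.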
Pages: 13
Related papers
50 in total
  • [21] High-spatial resolution ground-level ozone in Yunnan, China: A spatiotemporal estimation based on comparative analyses of machine learning models
    Man, Xingwei
    Liu, Rui
    Zhang, Yu
    Yu, Weiqiang
    Kong, Fanhao
    Liu, Li
    Luo, Yan
    Feng, Tao
    ENVIRONMENTAL RESEARCH, 2024, 251
  • [22] Research on satellite data-driven algorithm for ground-level ozone concentration inversion: case of Yunnan, China
    Yu, Weiqiang
    Feng, Tao
    Man, Xingwei
    Lin, Huan
    Zhang, Haonan
    Liu, Rui
    EARTH SCIENCE INFORMATICS, 2024, 17 (02) : 1053 - 1066
  • [23] Establishment of a structural equation model for ground-level ozone: a case study at an urban roadside site
    Lin, Kun-Ming
    Yu, Tai-Yi
    Chang, Len-Fu
    ENVIRONMENTAL MONITORING AND ASSESSMENT, 2014, 186 (12) : 8317 - 8328
  • [26] Solving environmental problems with regional decision-making: A case study of ground-level ozone
    Dinan, T
    Tawil, N
    NATIONAL TAX JOURNAL, 2003, 56 (01) : 123 - 138
  • [27] An Ensemble Learning Approach for Estimating High Spatiotemporal Resolution of Ground-Level Ozone in the Contiguous United States
    Requia, Weeberb J.
    Di, Qian
    Silvern, Rachel
    Kelly, James T.
    Koutrakis, Petros
    Mickley, Loretta J.
    Sulprizio, Melissa P.
    Amini, Heresh
    Shi, Liuhua
    Schwartz, Joel
    ENVIRONMENTAL SCIENCE & TECHNOLOGY, 2020, 54 (18) : 11037 - 11047
  • [28] A novel ensemble machine learning exposure model system for ground-level ozone at the national scale: A case of mainland China from 2013 to 2020
    Wang, Jiawei
    ENVIRONMENTAL IMPACT ASSESSMENT REVIEW, 2024, 109
  • [29] A Study on Statistical Data Mining Algorithms for the Prediction of Ground-Level Ozone Concentration in the El Paso–Juarez Area
    Md Al Masum Bhuiyan
    Suhail Mahmud
    Nusrat Sarmin
    Sanjida Elahee
    Aerosol Science and Engineering, 2020, 4 : 293 - 305
  • [30] Unraveling the Influence of Satellite-Observed Land Surface Temperature on High-Resolution Mapping of Ground-Level Ozone Using Interpretable Machine Learning
    He, Qingqing
    Cao, Jingru
    Saide, Pablo E.
    Ye, Tong
    Wang, Weihang
    ENVIRONMENTAL SCIENCE & TECHNOLOGY, 2024, 58 (36) : 15938 - 15948