A machine learning model for estimating daily maximum 8-hour average ozone concentrations using OMI and MODIS products

Authors
Jung, Chau-Ren [1, 2]
Chen, Wei [3]
Chen, Wei-Ting [4]
Su, Shih-Hao [5]
Chen, Bo-Ting [3]
Chang, Ling [3]
Hwang, Bing-Fang [3, 6]
Affiliations
[1] Department of Public Health, China Medical University, Taiwan
[2] Japan Environment and Children's Study Programme Office, National Institute for Environmental Studies, Tsukuba, Japan
[3] Department of Occupational Safety and Health, College of Public Health, China Medical University, Taichung, Taiwan
[4] Department of Atmospheric Sciences, National Taiwan University, Taipei, Taiwan
[5] Department of Atmospheric Sciences, Chinese Culture University, Taipei, Taiwan
[6] Department of Occupational Therapy, College of Medical and Health Science, Asia University, Taichung, Taiwan
Funding
National Aeronautics and Space Administration (NASA)
Keywords
Aerosols - Air pollution - Boundary layer flow - Boundary layers - Climate change - Forestry - Land use - Machine learning - Mean square error - Nitrogen oxides - Remote sensing
DOI
10.1016/j.atmosenv.2024.120587
Abstract
Tropospheric ozone (O3) is a criteria air pollutant that poses risks to organisms, and its formation is expected to increase under climate change. Satellite-based measurements offer a promising way to estimate ground-level air pollution on a large scale. However, most applications of satellite-based measurements have targeted fine particulate matter and nitrogen dioxide, and only a few have targeted O3. In this study, we combined satellite-based measurements from the Ozone Monitoring Instrument (OMI) and the MOderate-resolution Imaging Spectroradiometer (MODIS) with meteorological variables and land-use data to estimate daily maximum 8-h average O3 at 1-km resolution in Taiwan during 2004–2020. A random forest model was used to impute missing values in the satellite-based measurements, and an XGBoost model was used to estimate daily O3 concentrations. Model performance was evaluated using 10-fold cross-validation (CV), temporal validation, and spatial validation, and the results were reported as the coefficient of determination (R²) and root mean square error (RMSE). The 10-fold CV, temporally validated, and spatially validated R² (RMSE) of the XGBoost model were 0.82 (7.71 ppb), 0.63 (11.09 ppb), and 0.68 (10.27 ppb), respectively. Model performance was better in central and southern Taiwan. The ten most important predictors were date (relative importance = 12.15%), temperature (10.77%), meridional wind (10.71%), relative humidity (9.60%), zonal wind (8.14%), UV radiation (8.07%), total precipitation (6.35%), surface pressure (5.34%), surface O3 volume mixing ratio (4.93%), and boundary layer height (4.69%). The spatial distribution of the O3 estimates showed that daily maximum 8-h average O3 concentrations were higher in suburban and mountainous areas of central and southern Taiwan. This suggests that sensitive populations should remain alert to secondary pollutants even outside urban areas. The O3 estimates can be further used to evaluate the short-term and long-term effects of O3 on human health. © 2024 Elsevier Ltd
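The abstract reports R² and RMSE under random, temporal, and spatial validation. The following is a minimal sketch of how such metrics and group-wise hold-out folds could be computed, not the authors' implementation. It assumes R² is defined as 1 − SS_res/SS_tot (the abstract does not say whether this or the squared correlation was used), and the helper names `rmse`, `r_squared`, and `grouped_folds` are hypothetical.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs (e.g. ppb)."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def grouped_folds(groups, n_folds, seed=0):
    """Assign whole groups to folds and return the test row indices per fold.

    Grouping by monitoring station gives a spatial-style hold-out; grouping
    by year gives a temporal-style hold-out. Both are assumptions about how
    such validation might be set up.
    """
    unique_groups = sorted(set(groups))
    random.Random(seed).shuffle(unique_groups)
    fold_of_group = {g: i % n_folds for i, g in enumerate(unique_groups)}
    folds = [[] for _ in range(n_folds)]
    for row_index, group in enumerate(groups):
        folds[fold_of_group[group]].append(row_index)
    return folds


if __name__ == "__main__":
    # Made-up values for illustration only; not data from the study.
    observed = [30.0, 42.0, 55.0, 61.0]
    predicted = [33.0, 40.0, 50.0, 65.0]
    print("RMSE:", rmse(observed, predicted))
    print("R2:", r_squared(observed, predicted))

    stations = ["A", "A", "B", "B", "C", "C"]
    print("Station-wise test folds:", grouped_folds(stations, n_folds=3))
```

Holding out whole stations or whole years, rather than random rows, keeps the test data from sharing a location or period with the training data, which is one plausible reason the temporal and spatial R² values in the abstract are lower than the 10-fold CV value.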