A gridded air quality forecast through fusing site-available machine learning predictions from RFSML v1.0 and chemical transport model results from GEOS-Chem v13.1.0 using the ensemble Kalman filter

Cited by: 2

Authors:

Fang, Li ^{[1]}

Jin, Jianbing ^{[1]}

Segers, Arjo ^{[2]}

Liao, Hong ^{[1]}

Li, Ke ^{[1]}

Xu, Bufan ^{[1]}

Han, Wei ^{[3]}

Pang, Mijie ^{[1]}

Lin, Hai Xiang ^{[4,5]}

Affiliations:

[1] Nanjing Univ Informat Sci & Technol, Jiangsu Collaborat Innovat Ctr Atmospher Environm, Sch Environm Sci & Engn, Joint Int Res Lab Climate, Jiangsu Key Lab Atmospher Environm Monitoring & Po, Nanjing, Jiangsu, Peoples R China

[2] TNO, Dept Climate Air & Sustainabil, Utrecht, Netherlands

[3] Chinese Meteorol Adm, Numer Weather Predict Ctr, Beijing, Peoples R China

[4] Leiden Univ, Inst Environm Sci, Leiden, Netherlands

[5] Delft Univ Technol, Delft Inst Appl Math, Delft, Netherlands

Source:

GEOSCIENTIFIC MODEL DEVELOPMENT | 2023 / Volume 16 / Issue 16

Funding:

National Natural Science Foundation of China;

Keywords:

ANTHROPOGENIC EMISSIONS; AEROSOLS; CHINA; INTERPOLATION; ASSIMILATION; V3.9.1.1; GASES;

DOI:

10.5194/gmd-16-4867-2023

Chinese Library Classification (CLC) number:

P [Astronomy, Earth Sciences];

Discipline classification code:

07 ;

Abstract:

Statistical methods, particularly machine learning models, have gained significant popularity in air quality prediction. These models are commonly trained on historical measurements collected independently at environmental monitoring stations, and they issue operational forecasts using real-time ambient pollutant observations as inputs. Consequently, these high-quality machine learning models provide predictions only at the monitoring sites and cannot serve as an operational forecast on their own. In contrast, deterministic chemical transport models (CTMs), which simulate the full life cycles of air pollutants, provide predictions that are continuous in the 3D field. Despite this benefit, CTM predictions are typically biased, particularly at fine scales, owing to complex error sources in the emission, transport, and removal of pollutants. In this study, we propose fusing site-available machine learning predictions, taken from our regional feature selection-based machine learning model (RFSML v1.0), with a CTM prediction. Compared to a pure machine learning model, the fusion system provides a gridded prediction with relatively high accuracy. The prediction fusion was conducted using the ensemble Kalman filter (EnKF), which is based on Bayesian theory. Background error covariance is an essential part of the assimilation process. Ensemble CTM predictions driven by perturbed emission inventories were first used to represent the spatial covariance statistics, which could resolve the main part of the CTM error. In addition, a covariance inflation algorithm was designed to amplify the ensemble perturbations to account for model errors other than the uncertainty in emission inputs. The fused predictions were evaluated against independent measurements. The EnKF-based prediction fusion outperformed the pure CTM. Moreover, covariance inflation further enhanced the fused prediction, particularly in cases of severe underestimation.
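The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a generic serial stochastic EnKF with perturbed observations, a plain multiplicative inflation factor (the paper's own inflation algorithm is not specified in this record), uncorrelated site errors, and an observation operator that simply picks the grid cell containing each site. The names `inflate`, `enkf_fuse`, `site_cells`, `site_preds`, and `site_err_var` are hypothetical.

```python
import random
from statistics import mean


def inflate(ensemble, factor):
    """Multiplicative inflation: scale each member's deviation from the ensemble mean."""
    n_cells = len(ensemble[0])
    means = [mean(member[i] for member in ensemble) for i in range(n_cells)]
    return [[means[i] + factor * (member[i] - means[i]) for i in range(n_cells)]
            for member in ensemble]


def enkf_fuse(ensemble, site_cells, site_preds, site_err_var, factor=1.0, seed=0):
    """Fuse site predictions into a gridded ensemble; return the analysis mean.

    ensemble     : list of members, each a list of gridded concentrations
                   (stand-in for CTM runs with perturbed emissions)
    site_cells   : grid-cell index of each site (assumed observation operator)
    site_preds   : machine learning prediction at each site
    site_err_var : assumed error variance of the site predictions
    factor       : assumed multiplicative inflation factor (1.0 = none)
    """
    rng = random.Random(seed)
    members = inflate(ensemble, factor)  # new lists, the input is left untouched
    n_members = len(members)
    n_cells = len(members[0])
    for cell, pred in zip(site_cells, site_preds):
        # Ensemble values at the site and their sample variance.
        hx = [m[cell] for m in members]
        hx_mean = mean(hx)
        var_hx = sum((v - hx_mean) ** 2 for v in hx) / (n_members - 1)
        # Kalman gain per grid cell from the ensemble sample covariance.
        gains = []
        for i in range(n_cells):
            xi_mean = mean(m[i] for m in members)
            cov = sum((m[i] - xi_mean) * (v - hx_mean)
                      for m, v in zip(members, hx)) / (n_members - 1)
            gains.append(cov / (var_hx + site_err_var))
        # Update every member with its own perturbed copy of the site prediction.
        for m, v in zip(members, hx):
            innovation = pred + rng.gauss(0.0, site_err_var ** 0.5) - v
            for i in range(n_cells):
                m[i] += gains[i] * innovation
    return [mean(m[i] for m in members) for i in range(n_cells)]


# Toy case: a three-cell grid where the ensemble underestimates the site value.
toy_ensemble = [[40.0 + d, 50.0 + d, 60.0 + 0.5 * d] for d in (-10.0, -5.0, 0.0, 5.0, 10.0)]
fused = enkf_fuse(toy_ensemble, site_cells=[1], site_preds=[80.0], site_err_var=1.0)
fused_inflated = enkf_fuse(toy_ensemble, [1], [80.0], 1.0, factor=2.0)
```

In this toy case the site sits in cell 1, and the increment spreads to cells 0 and 2 in proportion to their ensemble covariance with cell 1. A larger `factor` widens the ensemble spread, which pushes the gain toward 1 and gives the site prediction more weight relative to the model.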


Pages: 4867-4882

Number of pages: 16