Enhancing Pollen Prediction in Beijing, a Chinese Megacity: Leveraging Ensemble Learning Models for Greater Accuracy

Cited by: 0
Authors
Ruan, Wenxi [1, 2]
Li, Ziming [3]
Sun, Zhaobin [1, 2]
An, Xingqin [1, 2]
Zhao, Yuxin [1, 2]
Zhang, Shuwen [4]
Liang, Yinglin [5]
Bu, Yaqin [6]
Xin, Jingyi [7]
Hang, Xiaoyi [7]
Affiliations
[1] Chinese Acad Meteorol Sci, State Key Lab Severe Weather, Beijing 100081, Peoples R China
[2] Chinese Acad Meteorol Sci, Key Lab Atmospher Chem, CMA, Beijing 100081, Peoples R China
[3] Beijing Weather Forecast Ctr, Beijing 100089, Peoples R China
[4] Nanjing Univ Chinese Med, Coll Tradit Chinese Med, Nanjing 210023, Peoples R China
[5] Chengdu Univ Informat Technol, Sch Atmospher Sci, Chengdu 610225, Peoples R China
[6] Lanzhou Univ, Coll Earth & Environm Sci, Key Lab Western Chinas Environm Syst, Minist Educ, Lanzhou 730000, Peoples R China
[7] Beijing Univ Chinese Med, Sch Tradit Chinese Med, Beijing 100029, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Machine learning; Forecasting; Pollen concentrations; Lead time; Time series analysis; AIRBORNE POLLEN; CLIMATE-CHANGE; CORYLUS; ALNUS;
DOI
10.4209/aaqr.240123
Chinese Library Classification
X [Environmental Science, Safety Science]
Subject classification code
08; 0830
Abstract
In North China, pollen is a leading allergen responsible for allergic rhinitis, and climate change is exacerbating allergenic pollen sensitization and posing significant health risks to residents. Despite its importance, pollen forecasting technology is still not sufficiently optimized. This study uses multi-year daily pollen concentration observations and ECMWF (European Centre for Medium-Range Weather Forecasts) real-time forecast data, applying twelve machine learning models to learn perturbations separated from characteristic quantities. It forecasts pollen concentrations in Beijing, with R² and RMSE as evaluation metrics. The findings show that the CatBoost, Extra Trees, and XGBoost algorithms perform well for three-day consecutive pollen predictions. For a one-day prediction period, the R² values for these algorithms are 0.72, 0.73, and 0.73, respectively. In contrast, algorithms such as Neural Network, LightGBM, and K-nearest Neighbor perform less well, though all models except Neural NetTorch achieve R² values above 0.50. Notably, the prediction accuracy of Neural NetTorch improves markedly with a longer prediction period, with its R² increasing from 0.34 to 0.67 as the period extends from one day to three days. The Weighted Ensemble model, which adjusts other models based on weighted optimization to mitigate excessive peaks, consistently yields stable results with an R² exceeding 0.67. Furthermore, the study assesses the importance of feature groups within the model, indicating that pollen emission intensity and phenological characteristics are crucial for both the training and testing phases, whereas meteorological factors predominantly influence pollen dispersion. Given the strong impact of meteorological conditions and nonlinear regulation on pollen, a type of bioaerosol, machine learning demonstrates substantial potential for simulating and predicting its concentrations.
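The two evaluation metrics named in the abstract, and the general idea of blending model outputs with weights, can be illustrated with a minimal Python sketch. This is not the study's implementation: the function names, the fixed equal weights, and all numbers below are hypothetical and chosen only for illustration, and the abstract does not say how the Weighted Ensemble weights are optimized.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def weighted_ensemble(predictions, weights):
    """Blend per-model predictions; weights are normalized to sum to 1."""
    total = sum(weights)
    n_days = len(predictions[0])
    return [
        sum(w * preds[d] for w, preds in zip(weights, predictions)) / total
        for d in range(n_days)
    ]


# Hypothetical daily pollen concentrations and two hypothetical model outputs.
observed = [120.0, 340.0, 560.0, 410.0, 150.0]
model_a = [100.0, 360.0, 620.0, 380.0, 170.0]  # overshoots the peak
model_b = [140.0, 300.0, 500.0, 430.0, 120.0]  # undershoots the peak

# Equal weights are an assumption; a real system would fit them on held-out data.
blend = weighted_ensemble([model_a, model_b], [0.5, 0.5])

for name, preds in [("model_a", model_a), ("model_b", model_b), ("blend", blend)]:
    print(name, round(r_squared(observed, preds), 3), round(rmse(observed, preds), 2))
```

In this toy case the two models err in opposite directions at the peak, so averaging them damps the peak error, which is the kind of smoothing effect the abstract attributes to the Weighted Ensemble model.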
Pages: 16