Enhancing Pollen Prediction in Beijing, a Chinese Megacity: Leveraging Ensemble Learning Models for Greater Accuracy

Citations: 0
Authors
Ruan, Wenxi [1, 2]
Li, Ziming [3]
Sun, Zhaobin [1, 2]
An, Xingqin [1, 2]
Zhao, Yuxin [1, 2]
Zhang, Shuwen [4]
Liang, Yinglin [5]
Bu, Yaqin [6]
Xin, Jingyi [7]
Hang, Xiaoyi [7]
Affiliations
[1] Chinese Acad Meteorol Sci, State Key Lab Severe Weather, Beijing 100081, Peoples R China
[2] Chinese Acad Meteorol Sci, Key Lab Atmospher Chem, CMA, Beijing 100081, Peoples R China
[3] Beijing Weather Forecast Ctr, Beijing 100089, Peoples R China
[4] Nanjing Univ Chinese Med, Coll Tradit Chinese Med, Nanjing 210023, Peoples R China
[5] Chengdu Univ Informat Technol, Sch Atmospher Sci, Chengdu 610225, Peoples R China
[6] Lanzhou Univ, Coll Earth & Environm Sci, Key Lab Western Chinas Environm Syst, Minist Educ, Lanzhou 730000, Peoples R China
[7] Beijing Univ Chinese Med, Sch Tradit Chinese Med, Beijing 100029, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Machine learning; Forecasting; Pollen concentrations; Lead time; Time series analysis; AIRBORNE POLLEN; CLIMATE-CHANGE; CORYLUS; ALNUS;
DOI
10.4209/aaqr.240123
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
In North China, pollen is a leading allergen responsible for allergic rhinitis, and climate change is exacerbating sensitization to allergenic pollen, posing significant health risks to residents. Despite its importance, pollen forecasting technology is still not sufficiently optimized. This study uses multi-year daily pollen concentration observations and ECMWF (European Centre for Medium-Range Weather Forecasts) real-time forecast data, applying twelve machine learning models to learn perturbations separated from characteristic quantities. It forecasts pollen concentrations in Beijing, with R² and RMSE as evaluation metrics. The findings show that the CatBoost, Extra Trees, and XGBoost algorithms perform well for pollen predictions over three consecutive days. For a one-day prediction period, the R² values for these algorithms are 0.72, 0.73, and 0.73, respectively. In contrast, algorithms such as Neural Network, LightGBM, and K-nearest Neighbor perform less well, though all models except Neural NetTorch achieve R² values above 0.50. Notably, the prediction accuracy of Neural NetTorch improves significantly with longer prediction time, with its R² increasing from 0.34 to 0.67 as the prediction period extends from one day to three days. The Weighted Ensemble model, which adjusts other models based on weighted optimization to mitigate excessive peaks, consistently yields stable results with an R² exceeding 0.67. Furthermore, the study assesses the importance of feature groups within the model, indicating that pollen emission intensity and phenological characteristics are crucial for both training and testing phases, whereas meteorological factors predominantly influence pollen dispersion. Given the strong impact of meteorological conditions and nonlinear regulation on pollen, a type of bioaerosol, machine learning demonstrates substantial potential for simulating and predicting its concentrations.
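The two evaluation metrics named in the abstract, and the general idea of blending several models' forecasts with weights, can be illustrated with a minimal Python sketch. This is not the study's implementation: the abstract does not say how the Weighted Ensemble's weights are optimized, so the fixed weights below are an assumption, as are the function names (`rmse`, `r_squared`, `weighted_ensemble`) and the made-up numbers.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def weighted_ensemble(predictions, weights):
    """Weighted average of per-model predictions.

    `predictions` is a list of equally long forecast lists, one per model.
    Weights are normalized here, so they need not sum to 1.
    """
    total = sum(weights)
    n = len(predictions[0])
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n)
    ]


# Hypothetical daily pollen concentrations and two hypothetical model forecasts.
observed = [10.0, 20.0, 30.0, 40.0]
model_a = [12.0, 18.0, 33.0, 37.0]
model_b = [8.0, 22.0, 27.0, 43.0]

# Assumed fixed weights; a real system would fit these on validation data.
blend = weighted_ensemble([model_a, model_b], [0.75, 0.25])

print("model_a:", rmse(observed, model_a), r_squared(observed, model_a))
print("blend:  ", rmse(observed, blend), r_squared(observed, blend))
```

In this toy data the two models err in opposite directions, so the blend has a lower RMSE and a higher R² than either model alone; with real forecasts the benefit depends on how correlated the models' errors are.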
Pages: 16