An Improved Air Quality Index Machine Learning-Based Forecasting with Multivariate Data Imputation Approach

Cited by: 20
Authors
Alkabbani, Hanin [1]
Ramadan, Ashraf [2]
Zhu, Qinqin [1]
Elkamel, Ali [1]
Affiliations
[1] Univ Waterloo, Dept Chem Engn, 200 Univ Ave West, Waterloo, ON N2L 3G1, Canada
[2] Kuwait Inst Sci Res, Environm & Life Sci Res Ctr, Environm Pollut & Climate Program, POB 24885, Safat 13109, Kuwait
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
ambient air quality observations; AQI; artificial neural network; machine learning; missForest imputation; forecasting; ARTIFICIAL NEURAL-NETWORKS; HYBRID ARIMA; PREDICTION; FINE; POLLUTION; MODEL; PARTICLES; MORTALITY; ENERGY; SAND;
DOI
10.3390/atmos13071144
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification codes
08; 0830
Abstract
Accurate, timely air quality index (AQI) forecasting helps industries select the most suitable air pollution control measures and helps the public reduce harmful exposure to pollution. This article proposes a comprehensive method to forecast AQIs. Initially, the work focused on predicting hourly ambient concentrations of PM2.5 and PM10 using artificial neural networks. Once the method was developed, the work was extended to the prediction of the other criteria pollutants, i.e., O3, SO2, NO2, and CO, which fed into the process of estimating the AQI. Predicting the AQI requires not only a robust forecasting model but also a sequence of pre-processing steps to select predictors and handle data issues, including gaps. The presented method handled gaps by imputing missing entries using missForest, a machine learning-based imputation technique that employs the random forest (RF) algorithm. Unlike the usual practice of using RF at the final forecasting stage, we utilized RF at the data pre-processing stage, i.e., for missing data imputation and feature selection, and we obtained promising results. The effectiveness of this imputation method was examined against a linear imputation method for the six criteria pollutants and the AQI. The proposed approach was validated against ambient air quality observations for Al-Jahra, a major city in Kuwait. The results showed that models trained using missForest-imputed data could generalize AQI forecasting, with a prediction accuracy of 92.41% when tested on new unseen data, which is better than earlier findings.
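The abstract says that pollutant forecasts feed into the AQI estimate but does not give the formula. As a rough illustration only, the sketch below assumes the common US EPA convention: each pollutant concentration is mapped to a sub-index by linear interpolation between breakpoints, and the AQI is the largest sub-index. The breakpoint table (US EPA 24-hour PM2.5 and PM10 values from before the 2024 revision, in µg/m3), the function names `sub_index` and `aqi`, and the handling of out-of-range values are assumptions for this example, not the authors' implementation. The other four criteria pollutants would follow the same pattern with their own tables.

```python
# Illustrative sketch, not the paper's code.
# ASSUMPTION: US EPA style breakpoints (pre-2024 revision), 24-hour averages
# in micrograms per cubic metre. Each row is (c_lo, c_hi, i_lo, i_hi).
BREAKPOINTS = {
    "PM2.5": [
        (0.0, 12.0, 0, 50),
        (12.1, 35.4, 51, 100),
        (35.5, 55.4, 101, 150),
        (55.5, 150.4, 151, 200),
        (150.5, 250.4, 201, 300),
        (250.5, 350.4, 301, 400),
        (350.5, 500.4, 401, 500),
    ],
    "PM10": [
        (0, 54, 0, 50),
        (55, 154, 51, 100),
        (155, 254, 101, 150),
        (255, 354, 151, 200),
        (355, 424, 201, 300),
        (425, 504, 301, 400),
        (505, 604, 401, 500),
    ],
}


def sub_index(pollutant, conc):
    """Map one concentration to a sub-index by piecewise linear interpolation."""
    if conc < 0:
        raise ValueError("concentration must be non-negative")
    for c_lo, c_hi, i_lo, i_hi in BREAKPOINTS[pollutant]:
        if conc <= c_hi:
            # Values that fall in the small gap between two rows are clamped
            # to the lower edge of the row (a simplification of EPA truncation).
            c = max(conc, c_lo)
            return round((i_hi - i_lo) / (c_hi - c_lo) * (c - c_lo) + i_lo)
    # ASSUMPTION: cap at the top of the scale for values beyond the table.
    return 500


def aqi(concentrations):
    """Overall AQI: the largest sub-index among the supplied pollutants."""
    return max(sub_index(p, c) for p, c in concentrations.items())


if __name__ == "__main__":
    forecast = {"PM2.5": 35.4, "PM10": 100}  # hypothetical forecast values
    print(aqi(forecast))
```

Because the overall index is a maximum, an error in a single pollutant forecast can change the reported AQI, which is one reason gap handling for every pollutant series matters before forecasting.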
Pages: 26