Machine Learning Approach-based Big Data Imputation Methods for Outdoor Air Quality forecasting

Cited by: 1
Authors
Narasimhan, D. [1]
Vanitha, M. [2]
Affiliations
[1] SASTRA Deemed Univ, Dept Math, Kumbakonam 612001, Tamil Nadu, India
[2] SASTRA Deemed Univ, Srinivasa Ramanujan Ctr, Dept Comp Sci & Engn, Kumbakonam 612001, Tamil Nadu, India
Keywords
Air quality; Big data analytics; Classification; Ensemble; Multiple imputation
DOI
10.56042/jsir.v82i03.71764
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
Missing data is a common problem in ambient air quality databases, and it is considerably worse in small towns and cities. It is a significant concern for environmental epidemiology: these settings have high pollution exposure levels worldwide, and gaps in the datasets obstruct health investigations that could later affect local and international policies. When a substantial number of observations contain missing values, standard errors increase because of the smaller sample size, which may significantly affect the final result. Generally, the performance of missing value imputation algorithms is proportional to the size of the database and the percentage of missing values within it. This paper proposes and demonstrates an ensemble-imputation-classification framework that rebuilds air quality information and forecasts air quality, using a dataset from Beijing, China. Various single and multiple imputation procedures are used to fill in the missing records. An ensemble of diverse classifiers is then applied to the imputed data to determine the air pollution level. The recommended model aims to reduce the error rate and improve accuracy. Extensive testing on datasets with actual missing values shows that the suggested methodology, which combines multiple imputation with ensemble techniques, significantly improves the accuracy of the air quality forecasting model compared with conventional single imputation techniques.
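The abstract does not list the specific imputation procedures or classifiers used, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a single pollutant column with gaps, a deliberately simplified multiple imputation (random draws from a normal distribution fitted to the observed values), and three toy threshold classifiers combined by majority vote. All function names, readings, and cutoffs are hypothetical.

```python
import random
import statistics
from collections import Counter


def mean_impute(column):
    """Single imputation: fill every gap with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = statistics.fmean(observed)
    return [mean if v is None else v for v in column]


def multiple_impute(column, m=5, seed=0):
    """Simplified multiple imputation: build m completed copies, drawing each
    missing value from a normal distribution fitted to the observed values."""
    rng = random.Random(seed)
    observed = [v for v in column if v is not None]
    mean = statistics.fmean(observed)
    spread = statistics.stdev(observed)
    return [
        [rng.gauss(mean, spread) if v is None else v for v in column]
        for _ in range(m)
    ]


def make_threshold_classifier(cutoff):
    """Toy base classifier: label a reading by comparing it with a cutoff."""
    return lambda value: "polluted" if value > cutoff else "clean"


def majority_vote(labels):
    """Return the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def classify_with_ensemble(classifiers, value):
    """Ensemble step: every base classifier votes on one reading."""
    return majority_vote(clf(value) for clf in classifiers)


def pooled_labels(column, classifiers, m=5, seed=0):
    """Classify every record in each imputed copy, then pool across copies."""
    copies = multiple_impute(column, m=m, seed=seed)
    per_copy = [
        [classify_with_ensemble(classifiers, v) for v in copy] for copy in copies
    ]
    return [majority_vote(labels) for labels in zip(*per_copy)]


# Hypothetical pollutant readings; None marks a missing record.
readings = [20.0, None, 90.0, 130.0, None, 40.0]
# Hypothetical cutoffs, chosen only so that the three classifiers can disagree.
ensemble = [make_threshold_classifier(c) for c in (35.0, 75.0, 115.0)]

single = [classify_with_ensemble(ensemble, v) for v in mean_impute(readings)]
pooled = pooled_labels(readings, ensemble, m=5, seed=0)
```

With mean imputation every gap receives the same value and therefore the same label, whereas the pooled result reflects how the label varies across the imputed copies. Observed readings are labelled identically under both approaches.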
Pages: 338-347
Page count: 10