Machine Learning Approach-based Big Data Imputation Methods for Outdoor Air Quality Forecasting

Cited by: 1
Authors
Narasimhan, D. [1]
Vanitha, M. [2]
Affiliations
[1] SASTRA Deemed Univ, Dept Math, Kumbakonam 612001, Tamil Nadu, India
[2] SASTRA Deemed Univ, Srinivasa Ramanujan Ctr, Dept Comp Sci & Engn, Kumbakonam 612001, Tamil Nadu, India
Keywords
Air quality; Big data analytics; Classification; Ensemble; Multiple imputation
DOI
10.56042/jsir.v82i03.71764
Chinese Library Classification (CLC) number
T [Industrial Technology]
Subject classification code
08
Abstract
Missing data is a common problem in ambient air quality databases, and it is considerably worse for small towns and cities. It is also a significant concern for environmental epidemiology: these settings have high pollution exposure levels worldwide, and gaps in the datasets obstruct health investigations that could later inform local and international policy. When a substantial number of observations contain missing values, the standard errors increase because the usable sample size shrinks, which may significantly affect the final result. In general, the performance of a missing value imputation algorithm varies with the size of the database and the percentage of missing values within it. This paper proposes and demonstrates an ensemble-imputation-classification framework that rebuilds air quality information, using a dataset from Beijing, China, in order to forecast air quality. Various single and multiple imputation procedures are used to fill in the missing records. An ensemble of diverse classifiers is then applied to the imputed data to determine the air pollution level. The proposed model aims to reduce the error rate and improve accuracy. Extensive testing on datasets with actual missing values shows that the proposed methodology, which combines multiple imputation with ensemble techniques, significantly improves the accuracy of the air quality forecasting model compared with conventional single imputation techniques.
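The impute-then-classify idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy readings, the feature choice (PM2.5, NO2), the labels, and every function name below are hypothetical. Random hot-deck draws stand in for a proper multiple imputation model, and a nearest-centroid classifier plus a 1-nearest-neighbour classifier stand in for the paper's "diverse classifiers". Only the Python standard library is used.

```python
import math
import random
import statistics
from collections import Counter

# Hypothetical toy readings [PM2.5, NO2]; None marks a missing value.
X = [
    [12.0, 20.0], [15.0, None], [None, 18.0], [10.0, 22.0],
    [150.0, 80.0], [None, 90.0], [160.0, None], [140.0, 85.0],
]
y = ["good"] * 4 + ["poor"] * 4


def mean_impute(rows):
    """Single imputation: replace each None with its column mean."""
    means = [statistics.fmean(v for v in col if v is not None)
             for col in zip(*rows)]
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]


def hot_deck_impute(rows, rng):
    """One stochastic imputation: fill each None with a randomly drawn
    observed value from the same column (a simple stand-in, by assumption,
    for a model-based multiple imputation draw)."""
    observed = [[v for v in col if v is not None] for col in zip(*rows)]
    return [[rng.choice(observed[j]) if v is None else v
             for j, v in enumerate(row)] for row in rows]


def fit_centroid(rows, labels):
    """Nearest-centroid classifier; returns a predict(x) function."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    centroids = {lab: [statistics.fmean(col) for col in zip(*members)]
                 for lab, members in groups.items()}
    return lambda x: min(centroids, key=lambda lab: math.dist(x, centroids[lab]))


def fit_1nn(rows, labels):
    """1-nearest-neighbour classifier; returns a predict(x) function."""
    return lambda x: labels[min(range(len(rows)),
                                key=lambda i: math.dist(x, rows[i]))]


def ensemble_predict(models, x):
    """Majority vote over all fitted models."""
    return Counter(model(x) for model in models).most_common(1)[0][0]


# Multiple imputation: m = 5 completed copies of the data, each filled
# with independent random draws from a seeded generator.
rng = random.Random(0)
imputed_sets = [hot_deck_impute(X, rng) for _ in range(5)]

# Ensemble: two different classifiers trained on every completed copy.
models = []
for rows in imputed_sets:
    models += [fit_centroid(rows, y), fit_1nn(rows, y)]

# Single-imputation baseline for comparison.
baseline = fit_centroid(mean_impute(X), y)

print(ensemble_predict(models, [11.0, 19.0]))
print(ensemble_predict(models, [155.0, 88.0]))
```

Training on several completed copies lets the vote reflect the uncertainty in the filled-in values, whereas mean imputation commits to a single guess per gap. On data this small the two approaches agree; the sketch shows only the mechanics and says nothing about the accuracy gains reported in the paper.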
Pages: 338-347
Page count: 10