Clustering Imputation for Air Pollution Data

Cited by: 3
Authors
Alahamade, Wedad [1, 3]
Lake, Iain [1]
Reeves, Claire E. [2]
De la Iglesia, Beatriz [1]
Affiliations
[1] Univ East Anglia, Norwich Res Pk, Norwich NR4 7TJ, Norfolk, England
[2] Univ East Anglia, Ctr Ocean & Atmospher Sci, Sch Environm Sci, Norwich, Norfolk, England
[3] Taibah Univ, Medina, Saudi Arabia
Keywords
Air quality; Uncertainty; Time series clustering; Imputation; Missing values; Health
DOI
10.1007/978-3-030-61705-9_48
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Air pollution is a global problem. The assessment of air pollution concentration data is important for evaluating human exposure and the associated risk to health. Unfortunately, air pollution monitoring stations often have periods of missing data or do not measure all pollutants. In this study, we experiment with different approaches to estimate the whole time series for a missing pollutant at a monitoring station as well as missing values within a time series. The main goal is to reduce the uncertainty in air quality assessment. To develop our approach we combine single and multiple imputation, nearest neighbour geographical distance methods and a clustering algorithm for time series. For each station that measures ozone, we produce various imputations for this pollutant and measure the similarity/error between the imputed and the real values. Our results show that imputation by average based on clustering results combined with multiple imputation for missing values is the most reliable and is associated with lower average error and standard deviation.
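The "imputation by average based on clustering results" idea in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the cluster labels are taken as already computed (the abstract does not name the clustering algorithm), the toy ozone values are invented, the function names `impute_by_cluster_mean` and `rmse` are hypothetical, and the multiple-imputation step for gaps within a series is not shown.

```python
import numpy as np

def impute_by_cluster_mean(series, labels, target):
    """Estimate the whole series for station `target` as the per-timestep
    mean of the other stations in the same cluster, ignoring NaN gaps."""
    series = np.asarray(series, dtype=float)
    labels = np.asarray(labels)
    peers = labels == labels[target]   # stations sharing the target's cluster
    peers[target] = False              # never use the target's own values
    if not peers.any():
        raise ValueError("target station has no peers in its cluster")
    return np.nanmean(series[peers], axis=0)

def rmse(estimate, observed):
    """Root-mean-square error over timesteps where both values are present."""
    mask = ~np.isnan(estimate) & ~np.isnan(observed)
    return float(np.sqrt(np.mean((estimate[mask] - observed[mask]) ** 2)))

# Toy data (assumed): rows are stations, columns are timesteps.
ozone = np.array([
    [40.0, 42.0, np.nan, 46.0],  # station 0, one missing value
    [44.0, 46.0, 48.0, 50.0],    # station 1
    [42.0, 44.0, 46.0, 48.0],    # station 2, treated as the "missing" station
    [10.0, 12.0, 11.0, 13.0],    # station 3, a different cluster
])
labels = [0, 0, 0, 1]            # assumed output of a time series clustering step

estimate = impute_by_cluster_mean(ozone, labels, target=2)
error = rmse(estimate, ozone[2])
print(estimate, error)
```

Holding out a station that does measure ozone, as done here for station 2, mirrors the evaluation described in the abstract: the imputed series is compared against the real values to obtain an error.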
Pages: 585-597
Number of pages: 13
Related papers
50 records in total
  • [21] Imputation method for lifetime exposure assessment in air pollution epidemiologic studies
    Jan Beyea
    Steven D Stellman
    Susan Teitelbaum
    Irina Mordukhovich
    Marilie D Gammon
    Environmental Health, 12
  • [22] Imputation method for lifetime exposure assessment in air pollution epidemiologic studies
    Beyea, Jan
    Stellman, Steven D.
    Teitelbaum, Susan
    Mordukhovich, Irina
    Gammon, Marilie D.
    ENVIRONMENTAL HEALTH, 2013, 12
  • [23] Environmental air pollution clustering using enhanced ensemble clustering methodology
    Soundararaj Vandhana
    Jagadeesan Anuradha
    Environmental Science and Pollution Research, 2021, 28 : 40746 - 40755
  • [24] Environmental air pollution clustering using enhanced ensemble clustering methodology
    Vandhana, Soundararaj
    Anuradha, Jagadeesan
    ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH, 2021, 28 (30) : 40746 - 40755
  • [25] Data imputation via evolutionary computation, clustering and a neural network
    Gautam, Chandan
    Ravi, Vadlamani
    NEUROCOMPUTING, 2015, 156 : 134 - 142
  • [26] Imputation method for missing data based on clustering and measure of property
    Kim, Sunghyun
    Kim, Dongjae
    KOREAN JOURNAL OF APPLIED STATISTICS, 2018, 31 (01) : 29 - 40
  • [27] A direct clustering method for imperfect microarray data without imputation
    Yun, Taegyun
    Kim, Suyoung
    Hwang, Taeho
    Yi, Gwan-Su
    PROCEEDINGS OF THE FRONTIERS IN THE CONVERGENCE OF BIOSCIENCE AND INFORMATION TECHNOLOGIES, 2007, : 183 - 187
  • [28] An Imputation-Based Method for Fuzzy Clustering of Incomplete Data
    Soni, S.
    Sharma, I.
    2017 INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING (ICCSP), 2017, : 616 - 621
  • [29] Partial distance evidential clustering for missing data with multiple imputation
    Tian, Hong-Peng
    Zhang, Zhen
    KNOWLEDGE-BASED SYSTEMS, 2025, 310
  • [30] Towards clustering of incomplete microarray data without the use of imputation
    Kim, Dae-Won
    Lee, Ki-Young
    Lee, Kwang H.
    Lee, Doheon
    BIOINFORMATICS, 2007, 23 (01) : 107 - 113