An evaluation of methods for imputation of missing trace element data in groundwaters

Cited by: 24
Authors
Dickson, Bruce L.
Giblin, Angela M.
Affiliations
[1] Dickson Res Pty Ltd, Gladesville, NSW 2111, Australia
[2] CSIRO, N Ryde, NSW 2113, Australia
Keywords
groundwater; uranium; self-organizing map; expectation maximization; imputation; Murray Basin; evaporation ponds;
DOI
10.1144/1467-7873/07-127
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification codes
0708; 070902
Abstract
Groundwater data-sets with pH and major cation-anion chemistry are widely available but data that include trace metals are much rarer. This paper examines two methods of data imputation to predict U concentrations using pH, major cations, anions and F in a data-set where some of the U concentrations are missing. The methods evaluated were self-organizing maps (SOM) and expectation maximization (EM). Evaluations were made using a groundwater data-set of 187 samples from NSW and Victoria, which contained a wide range of U concentrations up to 225 µg/l. Tests made by setting 25% and 50% of the U concentrations to missing showed that, at 25% missing, SOM gave reasonable estimates, identifying all the samples with higher U. EM did not clearly identify the higher samples. At 50% missing, neither method could accurately identify the higher U concentrations. Thus, imputation using samples with missing data included in the training data-set does not appear to be practical. However, a SOM pre-trained on a data-set with no missing U concentrations may be used to impute U concentrations for samples with 100% missing U data. Training using the original data-set and then imputing concentrations for a second set of 360 samples showed that the samples with higher measured U concentrations could generally be identified, but that other samples were also estimated to be U-rich. This method could substantially reduce the number of samples in a large data-set requiring further investigation. The performance of imputation for U reflects the complex interaction of water chemistry, geology and mineralogy that actually determines the U concentrations. Imputation is a useful method for improving estimates of data statistics. SOM, through its model-free approach, is a useful addition to the numerical analysis toolbox for geochemists.
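The test protocol described in the abstract (hide a fraction of one variable, impute it from the others, compare with the known values) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which EM variant or software was used, so the multivariate-normal model, the function name `em_impute`, the `ridge` parameter and the synthetic data below are all assumptions made for illustration. No SOM is shown, and no real groundwater values are used.

```python
import numpy as np

def em_impute(X, n_iter=50, ridge=1e-6):
    """Fill NaNs in X (samples x variables) by EM under an assumed
    multivariate normal model. Hypothetical helper, for illustration only."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    missing = np.isnan(X)
    # Start from column means and the covariance of the mean-filled data.
    mu = np.nanmean(X, axis=0)
    filled = np.where(missing, mu, X)
    sigma = np.cov(filled, rowvar=False, bias=True) + ridge * np.eye(p)
    for _ in range(n_iter):
        # Accumulates the conditional covariance of the imputed entries.
        extra = np.zeros((p, p))
        for i in np.flatnonzero(missing.any(axis=1)):
            m, o = missing[i], ~missing[i]
            if not o.any():
                # Nothing observed in this row: fall back to the current mean.
                filled[i] = mu
                extra += sigma
                continue
            # E-step: conditional mean of the missing part given the observed part.
            coef = np.linalg.solve(sigma[np.ix_(o, o)], sigma[np.ix_(o, m)]).T
            filled[i, m] = mu[m] + coef @ (X[i, o] - mu[o])
            extra[np.ix_(m, m)] += sigma[np.ix_(m, m)] - coef @ sigma[np.ix_(o, m)]
        # M-step: re-estimate mean and covariance from the completed data.
        mu = filled.mean(axis=0)
        centered = filled - mu
        sigma = (centered.T @ centered + extra) / n + ridge * np.eye(p)
    return filled

# Synthetic stand-in data: three predictor columns and one correlated target
# column. The sample count 187 matches the abstract; the values do not.
rng = np.random.default_rng(0)
n_samples = 187
predictors = rng.normal(size=(n_samples, 3))
target = predictors @ np.array([1.0, -0.5, 0.8]) + 0.3 * rng.normal(size=n_samples)
X_true = np.column_stack([predictors, target])

# Set 25% of the target column to missing, as in the abstract's first test.
hidden = rng.choice(n_samples, size=n_samples // 4, replace=False)
X_missing = X_true.copy()
X_missing[hidden, -1] = np.nan

X_imputed = em_impute(X_missing)

# Compare against the simplest baseline: filling with the observed column mean.
mean_fill = np.nanmean(X_missing[:, -1])
rmse_em = np.sqrt(np.mean((X_imputed[hidden, -1] - X_true[hidden, -1]) ** 2))
rmse_mean = np.sqrt(np.mean((mean_fill - X_true[hidden, -1]) ** 2))
print(f"RMSE, EM imputation: {rmse_em:.3f}")
print(f"RMSE, mean fill:     {rmse_mean:.3f}")
```

On this synthetic data the target is a nearly linear function of the predictors, which is the favourable case for a Gaussian model. Real U concentrations are typically strongly skewed and, as the abstract notes, depend on a complex interaction of chemistry, geology and mineralogy, so this sketch says nothing about how EM performs on the paper's data.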
Pages: 173-178
Number of pages: 6