Reconstructing missing data by comparing interpolation techniques: Applications for long-term water quality data

Cited by: 10
Authors
Larson, Danelle M. [1]
Bungula, Wako [2]
Lee, Amber [3]
Stockdill, Alaina [3]
McKean, Casey [3]
Miller, Frederick Forrest [3]
Davis, Killian [3]
Erickson, Richard A. [1]
Hlavacek, Enrika [1]
Affiliations
[1] US Geol Survey, Upper Midwest Environm Sci Ctr, La Crosse, WI 54603 USA
[2] Univ Wisconsin La Crosse, Dept Math & Stat, La Crosse, WI USA
[3] Univ Wisconsin La Crosse, Res Experience Undergraduates Program, La Crosse, WI USA
Source
LIMNOLOGY AND OCEANOGRAPHY-METHODS | 2023 / Volume 21 / Issue 07
Keywords
MACHINE LEARNING-METHODS; SPATIAL INTERPOLATION; RIVER;
DOI
10.1002/lom3.10556
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Missing data are common yet must be addressed to support proper inference and to expand the datasets that guide our limnological understanding and management of aquatic systems. Interpolation methods (i.e., estimating missing values using known values within the dataset) can alleviate data gaps and the common problems they create. We compared seven popular interpolation methods for predicting substantial missingness in a long-term water quality dataset from the Upper Mississippi River, U.S.A. The dataset included 80,000 sampling sites sampled over 30 yr and had substantial missingness for total nitrogen (TN), total phosphorus (TP), and water velocity. For all three interpolated water quality variables, random forests had very high prediction accuracy and outperformed ordinary kriging, polynomial regressions, regression trees, and inverse distance weighting. TP had a mean absolute error (MAE) of 0.03 mg L⁻¹, TN had an MAE of 0.39 mg L⁻¹, and water velocity had an MAE of 0.10 m s⁻¹. The random forests' error rates were mapped and showed low spatiotemporal variability across the riverscape, indicating high model performance across many habitat types and large spatial scales. In the current era of "big data," interpolation becomes an imperative step prior to ecological analyses yet remains unfamiliar and underutilized. Our research briefly describes the importance of addressing missingness and provides a roadmap to conduct model intercomparisons of other big datasets. We also share adaptable data analysis scripts, which allow others to readily conduct interpolation comparisons for many limnology applications and contexts.
Pages: 435-449
Number of pages: 15
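Illustrative example
The abstract scores each interpolation method by its mean absolute error (MAE) on values that are known but withheld from the model. The sketch below shows that evaluation pattern for one of the named methods, inverse distance weighting, compared against a naive mean-value baseline. It is not the authors' code: the synthetic surface, the 25% hold-out fraction, the power parameter, and all function and variable names are assumptions made for illustration only.

```python
import numpy as np


def idw_predict(xy_known, z_known, xy_query, power=2.0, eps=1e-12):
    """Inverse distance weighting: each prediction is a weighted mean of the
    known values, with weights proportional to 1 / distance**power."""
    # Pairwise distances, shape (n_query, n_known).
    diff = xy_query[:, None, :] - xy_known[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    # Clamp tiny distances so a query on top of a known site does not divide by zero.
    weights = 1.0 / np.maximum(dist, eps) ** power
    return (weights @ z_known) / weights.sum(axis=1)


def mean_absolute_error(observed, predicted):
    """MAE: average magnitude of the prediction errors."""
    return float(np.mean(np.abs(observed - predicted)))


# Synthetic stand-in for a spatially structured variable (assumption, not real data).
rng = np.random.default_rng(0)
n_sites = 400
xy = rng.uniform(0.0, 10.0, size=(n_sites, 2))
z = np.sin(xy[:, 0]) + 0.5 * np.cos(xy[:, 1]) + rng.normal(0.0, 0.05, n_sites)

# Withhold a random 25% of sites to play the role of "missing" values.
held_out = rng.random(n_sites) < 0.25
xy_train, z_train = xy[~held_out], z[~held_out]
xy_test, z_test = xy[held_out], z[held_out]

# Score the interpolator and the baseline on the withheld values only.
mae_idw = mean_absolute_error(z_test, idw_predict(xy_train, z_train, xy_test))
mae_mean = mean_absolute_error(z_test, np.full(z_test.shape, z_train.mean()))

print(f"IDW MAE:           {mae_idw:.3f}")
print(f"Mean-baseline MAE: {mae_mean:.3f}")
```

The same hold-out and MAE steps could wrap any other interpolator (for example a random forest or kriging model) so that all methods are scored on identical withheld sites, which is the kind of model intercomparison the abstract describes.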