Revitalizing temperature records: A novel framework towards continuous data reconstruction using univariate and multivariate imputation techniques

Authors
Kumar, Hanumapura Kumaraswamy Yashas [1]
Varija, Kumble [1]
Affiliations
[1] Natl Inst Technol NIT, Dept Water Resources & Ocean Engn, Surathkal, Karnataka, India
Keywords
Climate studies; Missing data imputation; Singular Spectrum Analysis; Multivariate techniques; Multi-criteria decision making; Statistical Indicators; SINGULAR SPECTRUM ANALYSIS; MISSING VALUE IMPUTATION; SOLAR-RADIATION; SPATIAL INTERPOLATION; VIKOR METHOD; MODEL; PERFORMANCE; SELECTION; CLIMATE; PRECIPITATION;
DOI
10.1016/j.atmosres.2024.107754
Chinese Library Classification (CLC) number
P4 [Atmospheric Sciences (Meteorology)]
Discipline classification code
0706; 070601
Abstract
Data gaps are a recurring challenge in climate research, hindering effective time series analysis and modeling. This study proposes a novel two-step data imputation framework to address temperature time series with a long continuous gap surrounded by predictor stations with sporadic missingness. The method leverages iterative gap-filling Singular Spectrum Analysis (SSA) for the small sporadic gaps, followed by multivariate techniques like Inverse Distance Weightage (IDW), Kriging, Spatial Regression Test (SRT), Point Estimation method of Biased Sentinel Hospital-based Area Disease Estimation (P-BSHADE), Random Forest (RF), Support Vector Machines (SVM), and MissForest (MF) for the longer gap. Once the sporadic gaps are effectively addressed with SSA, the method carefully applies multivariate techniques to impute the long continuous gap. Prioritizing accuracy, comprehensive cross-validation with class-based statistical indicators is employed to minimize any potential biases introduced by the imputation process. The study shows the effectiveness of SSA in filling small sporadic gaps using an optimal window length (M ≈ 365 days) and eigentriple grouping (ET = 30). Notably, for maximum temperature, P-BSHADE and SVM achieve an impressive accuracy (e.g., Legates's Coefficient of Efficiency (LCE), 0.75–0.44, Combined Performance Index (CPI), 6.3%–19.1%) attributed to their ability to capture spatial and/or temporal heterogeneity. While SRT and P-BSHADE offer acceptable performance for minimum temperature (e.g., LCE, 0.51–0.27, CPI, 0.7%–23.7%), the study also uncovers a complex interplay between missing data, predictor stations, and autocorrelation affecting imputation accuracy. This suggests that the reduced performance of certain techniques likely stems from the decline in spatial and spatiotemporal autocorrelation between the target station and its predictors.
Overall, this study presents a promising framework for handling complex missing data scenarios often encountered in climate time series analysis, paving the way for more robust and reliable analysis and modeling.
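
The two-step idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`ssa_reconstruct`, `ssa_gapfill`, `idw_estimate`), the mean-value initial guess, the stopping rule, and the IDW power of 2 are all assumptions made for illustration. Here `window` plays the role of the window length M and `n_components` the number of leading eigentriples (ET) named in the abstract.

```python
import numpy as np


def ssa_reconstruct(x, window, n_components):
    """Rank-limited SSA reconstruction of a complete 1-D series."""
    n = len(x)
    k = n - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    r = min(n_components, len(s))
    approx = (u[:, :r] * s[:r]) @ vt[:r]
    # Diagonal averaging turns the low-rank matrix back into a series.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for j in range(k):
        recon[j:j + window] += approx[:, j]
        counts[j:j + window] += 1
    return recon / counts


def ssa_gapfill(x, window, n_components, max_iter=50, tol=1e-6):
    """Step 1 (assumed scheme): iteratively fill NaN gaps with SSA."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    filled = x.copy()
    if not missing.any():
        return filled
    filled[missing] = np.nanmean(x)  # crude initial guess
    for _ in range(max_iter):
        recon = ssa_reconstruct(filled, window, n_components)
        change = np.max(np.abs(recon[missing] - filled[missing]))
        filled[missing] = recon[missing]  # observed values stay untouched
        if change < tol:
            break
    return filled


def idw_estimate(neighbor_values, distances, power=2.0):
    """Step 2 (one of several options): inverse-distance-weighted estimate
    of the target station from predictor stations on a single day."""
    v = np.asarray(neighbor_values, dtype=float)
    d = np.asarray(distances, dtype=float)
    ok = ~np.isnan(v)  # skip predictors that are still missing
    w = 1.0 / d[ok] ** power
    return float(np.sum(w * v[ok]) / np.sum(w))
```

In this sketch each predictor station would first be passed through `ssa_gapfill` (for daily data, with `window` near 365 and `n_components=30` as reported in the abstract), and the long gap at the target station would then be estimated day by day, for example with `idw_estimate`. The other multivariate techniques named in the abstract would take the place of that second function.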
Pages: 27