IMPUTATION FOR CONSECUTIVE MISSING VALUES IN NON-STATIONARY TIME SERIES DATA

Cited by: 3
Authors
Wongoutong, Chantha [1]
Affiliations
[1] Kasetsart Univ, Fac Sci, Dept Stat, Bangkok 10900, Thailand
Keywords
imputation method; consecutive missing values; non-stationary time series
DOI
10.17654/AS064010087
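Chinese Library Classification (CLC) number
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract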
Missing data have a significant effect on forecasting from time series data. Since many applications require complete data, missing values must be imputed before further data processing is possible. Several methods to account for missing data have been proposed, but the appropriate imputation method depends on the type of time series and the pattern of the missing data. Simple methods such as mean or moving average (MA) imputation do not perform well in complex situations such as a non-stationary time series in which both trend and seasonality exist. This study focuses on handling consecutive missing values in non-stationary time series with both trend and seasonality by deseasonalizing the data and then applying interpolation (DES-I) or Kalman (DES-K) imputation, using na_seadec in the imputeTS R package. Five real datasets were used to evaluate the performance of the imputation methods under three scenarios in which sequences of artificially missing data were created in the time series at missing rates of 10%, 20% and 50%. The performances of traditional imputation methods such as interpolation, Kalman, MA, last observation carried forward, mean, and linear trend at point were compared with those of DES-I and DES-K. In terms of RMSE and MAPE, the two methods (DES-I and DES-K) were far superior to the six traditional imputation methods, by on the order of 60-80%. Hence, deseasonalizing is a necessary step before imputing missing values for time series data exhibiting both trend and seasonality.
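The deseasonalize-then-interpolate idea described in the abstract can be illustrated with a minimal Python sketch. This is not the imputeTS implementation and not the author's code. The function name `des_interpolate`, the straight-line trend fit, the per-position seasonal means, and the synthetic series are all assumptions chosen for illustration. The sketch also assumes that every position in the seasonal cycle has at least one observed value.

```python
import numpy as np

def des_interpolate(y, period):
    """Hypothetical helper: remove seasonality, interpolate gaps, restore it."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    obs = ~np.isnan(y)

    # Straight-line trend fitted to observed points only (a simplification).
    slope, intercept = np.polyfit(t[obs], y[obs], 1)
    detrended = y - (slope * t + intercept)

    # Seasonal index: mean detrended value at each position in the cycle.
    seasonal = np.array([np.nanmean(detrended[k::period]) for k in range(period)])
    seasonal -= seasonal.mean()
    season_full = seasonal[t % period]

    # Deseasonalize, linearly interpolate the missing run, then reseasonalize.
    deseason = y - season_full
    filled = deseason.copy()
    filled[~obs] = np.interp(t[~obs], t[obs], deseason[obs])
    return filled + season_full

def rmse(a, b):
    """Root mean squared error between two equal-length arrays."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

# Synthetic series with trend and a 12-step season (illustrative data only).
t = np.arange(48)
truth = 0.5 * t + 10.0 * np.sin(2 * np.pi * t / 12)
y = truth.copy()
gap = slice(18, 24)          # six consecutive missing values
y[gap] = np.nan

des = des_interpolate(y, period=12)
obs = ~np.isnan(y)
plain = y.copy()
plain[~obs] = np.interp(t[~obs], t[obs], y[obs])  # interpolation without deseasonalizing

rmse_des = rmse(des[gap], truth[gap])
rmse_plain = rmse(plain[gap], truth[gap])
print(rmse_des, rmse_plain)
```

On this synthetic series, plain linear interpolation draws a straight line across the seasonal trough, while the deseasonalized variant restores the seasonal shape inside the gap. The figures it prints say nothing about the paper's reported results.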
Pages: 87-102
Number of pages: 16