Reconstructing missing data sequences in multivariate time series: an application to environmental data

Cited by: 7
Authors
Parrella, Maria Lucia [1 ]
Albano, Giuseppina [1 ]
La Rocca, Michele [1 ]
Perna, Cira [1 ]
Affiliations
[1] Univ Salerno, Dip Sci Econ & Stat, Salerno, Italy
Source
STATISTICAL METHODS AND APPLICATIONS | 2019 / Vol. 28 / Issue 02
Keywords
Spatial correlation; Missing values; PM10 data; Time series; AIR-POLLUTION; IMPUTATION; PROGRAM; QUALITY; VALUES;
DOI
10.1007/s10260-018-00435-9
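Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code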
020208 ; 070103 ; 0714 ;
Abstract
Missing data arise in many statistical analyses because of faults in data acquisition, and they can significantly affect the conclusions drawn from the data. In environmental data, for example, the standard approach adopted by Environmental Protection Agencies is to delete observations with incomplete information from the study, which leads to a massive underestimation of many indexes commonly used to evaluate air quality. In multivariate time series, moreover, not only isolated values but also long sequences of some components may be missing. In such cases, it is practically impossible to reconstruct the missing sequences based on the serial dependence structure alone. In this work, we propose a new procedure that reconstructs the missing sequences by exploiting the spatial correlation and the serial correlation of the multivariate time series simultaneously. The proposed procedure is based on a spatial-dynamic model and imputes the missing values as a linear combination of the neighboring contemporaneous observations and their lagged values. It is specifically oriented to spatio-temporal data, although it is general enough to be applied to generic stationary multivariate time series. In this paper, the procedure is applied to pollution data, where missing sequences are a serious concern, with remarkably satisfactory performance.
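The imputation idea described in the abstract (a missing value estimated as a linear combination of neighboring contemporaneous observations and their lagged values) can be illustrated with a minimal sketch. This is not the authors' spatial-dynamic model or their estimator: it swaps in a plain least-squares regression, and the function name `impute_series`, the lag of 1, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np

def impute_series(X, target, lag=1):
    """Fill NaNs in column `target` of X (shape T x p) using ordinary least
    squares on the other columns at time t and at time t - lag.
    Hypothetical helper: a simplification, not the paper's procedure."""
    X = np.asarray(X, dtype=float)
    T, p = X.shape
    others = [j for j in range(p) if j != target]
    rows = np.arange(lag, T)  # time points that have a lagged predecessor
    # Design matrix: intercept, neighbours at time t, neighbours at time t - lag.
    Z = np.column_stack([
        np.ones(len(rows)),
        X[rows][:, others],
        X[rows - lag][:, others],
    ])
    y = X[rows, target]
    usable = ~np.isnan(Z).any(axis=1)   # predictors fully observed
    fit = usable & ~np.isnan(y)         # rows used to estimate coefficients
    beta, *_ = np.linalg.lstsq(Z[fit], y[fit], rcond=None)
    out = X[:, target].copy()
    fill = usable & np.isnan(y)         # rows we are able to impute
    out[rows[fill]] = Z[fill] @ beta
    return out

# Synthetic example (assumed data): one target station driven by two neighbours.
rng = np.random.default_rng(0)
T = 300
neighbours = rng.normal(size=(T, 2))
lagged = np.vstack([np.zeros((1, 2)), neighbours[:-1]])
truth = (0.6 * neighbours[:, 0] + 0.3 * neighbours[:, 1]
         + 0.2 * lagged[:, 0] + rng.normal(scale=0.05, size=T))

observed = truth.copy()
observed[100:140] = np.nan              # a long missing sequence
X = np.column_stack([observed, neighbours])

reconstructed = impute_series(X, target=0)
rmse = np.sqrt(np.mean((reconstructed[100:140] - truth[100:140]) ** 2))
print(f"RMSE on the missing block: {rmse:.3f}")
```

Because the gap is filled from the other series rather than from the target's own past, the sketch can reconstruct a long missing block, which is the situation the abstract says serial dependence alone cannot handle.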
Pages: 359-383
Page count: 25
Related papers
50 in total
  • [31] Application of Two-Directional Time Series Models to Replace Missing Data
    Huo, Jinsheng
    Cox, Chris D.
    Seaver, William L.
    Robinson, R. Bruce
    Jiang, Yan
    [J]. JOURNAL OF ENVIRONMENTAL ENGINEERING, 2010, 136 (04) : 435 - 443
  • [32] Clustering of multivariate time-series data
    Singhal, A
    Seborg, DE
    [J]. PROCEEDINGS OF THE 2002 AMERICAN CONTROL CONFERENCE, VOLS 1-6, 2002, 1-6 : 3931 - 3936
  • [33] Clustering multivariate time-series data
    Singhal, A
    Seborg, DE
    [J]. JOURNAL OF CHEMOMETRICS, 2005, 19 (08) : 427 - 438
  • [34] Multivariate time series models for mixed data
    Debaly, Zinsou-Max
    Truquet, Lionel
    [J]. BERNOULLI, 2023, 29 (01) : 669 - 695
  • [35] Temporal data mining for multivariate time series
    Guimaraes, G
    [J]. IC-AI'2000: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 1-III, 2000, : 1379 - 1385
  • [36] Interactive Visualization of Multivariate Time Series Data
    Martin, Shawn
    Quach, Tu-Toan
    [J]. FOUNDATIONS OF AUGMENTED COGNITION: NEUROERGONOMICS AND OPERATIONAL NEUROSCIENCE, PT II, 2016, 9744 : 322 - 332
  • [37] Graph neural networks for multivariate time series regression with application to seismic data
    Bloemheuvel, Stefan
    van den Hoogen, Jurgen
    Jozinovic, Dario
    Michelini, Alberto
    Atzmueller, Martin
    [J]. INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS, 2023, 16 (03) : 317 - 332
  • [38] Graph neural networks for multivariate time series regression with application to seismic data
    Stefan Bloemheuvel
    Jurgen van den Hoogen
    Dario Jozinović
    Alberto Michelini
    Martin Atzmueller
    [J]. International Journal of Data Science and Analytics, 2023, 16 : 317 - 332
  • [39] Time Series Data and Recent Imputation Techniques for Missing Data: A Review
    Zainuddin, Aznilinda
    Hairuddin, Muhammad Asraf
    Yassin, Ahmad Ihsan Mohd
    Abd Latiff, Zatul Iffah
    Azhar, Aziemah
    [J]. 2022 INTERNATIONAL CONFERENCE ON GREEN ENERGY, COMPUTING AND SUSTAINABLE TECHNOLOGY (GECOST), 2022, : 346 - 350
  • [40] Comparison methods of estimating missing data in real data time series
    Tasho, Eljona Milo
    Zeqo, Lorena Margo
    [J]. ASIAN-EUROPEAN JOURNAL OF MATHEMATICS, 2022, 15 (10)