Spatial matrix completion for spatially misaligned and high-dimensional air pollution data

Cited by: 1
Authors
Vu, Phuong T. [1]
Szpiro, Adam A. [1]
Simon, Noah [1]
Affiliations
[1] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
Keywords
low-rank matrix completion; principal component analysis; proximal algorithm; spatial prediction; OLDER-ADULTS; PM2.5
DOI
10.1002/env.2713
Chinese Library Classification number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
In health-pollution cohort studies, accurate predictions of pollutant concentrations at new locations are needed, since the locations of fixed monitoring sites and study participants are often spatially misaligned. For multi-pollutant data, principal component analysis (PCA) is often incorporated to obtain the low-rank (LR) structure of the data prior to spatial prediction. The recently developed predictive PCA modifies the traditional algorithm to improve overall predictive performance by leveraging both the LR and spatial structures within the data. However, predictive PCA requires complete data or an initial imputation step. Nonparametric imputation techniques that do not account for spatial information may distort the underlying structure of the data, and thus further reduce predictive performance. We propose a convex optimization problem inspired by the LR matrix completion framework and develop a proximal algorithm to solve it. Missing data are imputed and handled concurrently within the algorithm, which eliminates the need for a separate imputation step. We review the connections among existing methods developed for spatially misaligned multivariate data, and show that our algorithm has a lower computational burden and leads to reliable predictive performance as the severity of missing data increases.
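As a rough illustration of the LR matrix completion framework the abstract refers to, the sketch below runs a generic proximal gradient iteration for nuclear-norm-penalized completion (singular value soft-thresholding). It is not the authors' algorithm: the spatial structure that the paper builds into its objective is omitted here, and the names `svt`, `complete_matrix`, `lam`, `n_iter`, and `tol` are hypothetical choices made for this example.

```python
import numpy as np

def svt(Z, tau):
    """Proximal operator of tau * nuclear norm: soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def complete_matrix(Y, observed, lam=1.0, n_iter=500, tol=1e-8):
    """Proximal gradient for
        min_X 0.5 * ||P_obs(Y - X)||_F^2 + lam * ||X||_*
    where `observed` is a boolean mask of the observed cells of Y.
    Assumption: no spatial term is included; this is plain matrix completion.
    """
    Y0 = np.where(observed, Y, 0.0)      # missing cells (possibly NaN) set to 0
    X = np.zeros_like(Y0, dtype=float)
    for _ in range(n_iter):
        # Gradient step with step size 1: observed cells take the data value,
        # missing cells keep the current estimate (imputation inside the loop).
        Z = np.where(observed, Y0, X)
        X_new = svt(Z, lam)              # proximal step on the nuclear norm
        done = np.linalg.norm(X_new - X) <= tol * max(1.0, np.linalg.norm(X))
        X = X_new
        if done:
            break
    return X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 8))  # rank-2 toy data
    mask = rng.random(truth.shape) < 0.7                        # about 70% observed
    Y = np.where(mask, truth, np.nan)
    X_hat = complete_matrix(Y, mask, lam=0.5)
    print("filled matrix shape:", X_hat.shape)
```

Because the missing cells are refilled with the current estimate at every iteration, imputation happens inside the solver rather than as a separate preprocessing step, which mirrors the point made in the abstract.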
Pages: 14
Related papers
50 in total
  • [1] Probabilistic predictive principal component analysis for spatially misaligned and high-dimensional air pollution data with missing observations
    Vu, Phuong T.
    Larson, Timothy V.
    Szpiro, Adam A.
    [J]. ENVIRONMETRICS, 2020, 31 (04)
  • [2] Handling high-dimensional data in air pollution forecasting tasks
    Domanska, Diana
    Lukasik, Szymon
    [J]. ECOLOGICAL INFORMATICS, 2016, 34 : 70 - 91
  • [3] A novel principal component analysis for spatially misaligned multivariate air pollution data
    Jandarov, Roman A.
    Sheppard, Lianne A.
    Sampson, Paul D.
    Szpiro, Adam A.
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2017, 66 (01) : 3 - 28
  • [4] Gaussian Approximation and Spatially Dependent Wild Bootstrap for High-Dimensional Spatial Data
    Kurisu, Daisuke
    Kato, Kengo
    Shao, Xiaofeng
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2023,
  • [5] Robust factor modelling for high-dimensional time series: An application to air pollution data
    Reisen, Valderio Anselmo
    Sgrancio, Adriano Marcio
    Levy-Leduc, Celine
    Bondon, Pascal
    Monte, Edson Zambon
    Aranda Cotta, Higor Henrique
    Ziegelmann, Flavio Augusto
    [J]. APPLIED MATHEMATICS AND COMPUTATION, 2019, 346 : 842 - 852
  • [6] Bayesian modelling for spatially misaligned health and air pollution data through the INLA-SPDE approach
    Cameletti, Michela
    Gomez-Rubio, Virgilio
    Blangiardo, Marta
    [J]. SPATIAL STATISTICS, 2019, 31
  • [7] Testing the Mean Matrix in High-Dimensional Transposable Data
    Touloumis, Anestis
    Tavare, Simon
    Marioni, John C.
    [J]. BIOMETRICS, 2015, 71 (01) : 157 - 166
  • [8] On eigenvalues of a high-dimensional spatial-sign covariance matrix
    Li, Weiming
    Wang, Qinwen
    Yao, Jianfeng
    Zhou, Wang
    [J]. BERNOULLI, 2022, 28 (01) : 606 - 637
  • [9] Approximate retrieval of high-dimensional data by spatial indexing
    Shinohara, T
    An, JY
    Ishizaka, H
    [J]. DISCOVERY SCIENCE, 1998, 1532 : 141 - 149
  • [10] A High-Dimensional Video Sequence Completion Method with Traffic Data Completion Generative Adversarial Networks
    Wu, Lan
    Gao, Tian
    Wen, Chenglin
    Zhang, Kunpeng
    Kong, Fanshi
    [J]. WIRELESS COMMUNICATIONS & MOBILE COMPUTING, 2021, 2021