Computationally efficient nonstationary nearest-neighbor Gaussian process models using data-driven techniques

Cited by: 3
Authors
Konomi, B. A. [1]
Hanandeh, A. A. [2]
Ma, P. [3, 4]
Kang, E. L. [1]
Affiliations
[1] Univ Cincinnati, Dept Math Sci, Div Stat & Data Sci, Cincinnati, OH 45221 USA
[2] Yarmouk Univ, Dept Stat, Irbid, Jordan
[3] Stat & Appl Math Sci Inst, Durham, NC USA
[4] Duke Univ, Dept Stat Sci, Durham, NC USA
Funding
National Science Foundation (USA)
Keywords
Bayesian hierarchical modeling; binary tree; large data sets; Markov chain Monte Carlo (MCMC); nonstationary covariance function; TOMS ozone data; RANDOM-FIELDS; BAYESIAN-INFERENCE; LIKELIHOOD;
DOI
10.1002/env.2571
Chinese Library Classification number
X [Environmental science, safety science]
Discipline classification code
08; 0830
Abstract
Due to the increased availability of measurements of various geophysical processes, a need has arisen for statistical methods suitable for the analysis of very large nonstationary spatial data sets. Nearest-neighbor Gaussian process (NNGP) models are among the latest and most popular Gaussian process-based models that reduce computational complexity and memory storage. Bayesian inference for these models relies on a parametric covariance function that is often assumed to be stationary or known. Given that NNGP models are more sensitive to the stationarity assumption than other reduction methods, there is a need to build nonstationary covariance functions within NNGP models. However, constructing a nonstationary covariance function and/or matrix may itself be computationally expensive in the presence of big data. In this paper, we develop an efficient two-stage approach that addresses both nonstationarity and computational complexity for big spatial data sets. We propose a new low-cost, data-driven, tree-structured partitioning technique to divide the spatial region into distinct subregions. Given the partitions, we construct computationally efficient nonstationary covariance functions for NNGP models. We demonstrate the performance of our approach through simulation experiments and an application to the global Total Ozone Mapping Spectrometer (TOMS) data set, in which the proposed approach performs well in terms of both prediction accuracy and computational complexity.
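To illustrate how a nearest-neighbor construction reduces cost, the sketch below evaluates a Gaussian log-likelihood as a product of univariate conditionals, each conditioning on at most m earlier neighbors. It is not the authors' method. The stationary exponential covariance, the nugget `tau2`, the use of the given row order as the ordering, and all function names are assumptions made for illustration, and the paper's nonstationary, partition-based covariance is not reproduced here.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=1.0):
    """Exponential covariance between rows of a and b (illustrative choice)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / phi)

def nngp_loglik(coords, y, m, tau2=0.1, cov=exp_cov):
    """Zero-mean Gaussian log-likelihood under a nearest-neighbor approximation.

    Point i conditions on its (at most) m nearest points among 0..i-1,
    so each step solves an m x m system instead of an n x n one.
    """
    total = 0.0
    for i in range(len(y)):
        var = cov(coords[i:i + 1], coords[i:i + 1])[0, 0] + tau2
        mean = 0.0
        if i > 0:
            # Brute-force neighbor search over earlier points (assumption:
            # fine for a sketch; a spatial index would be used at scale).
            d = np.linalg.norm(coords[:i] - coords[i], axis=1)
            nb = np.argsort(d)[:m]
            c_nn = cov(coords[nb], coords[nb]) + tau2 * np.eye(len(nb))
            c_in = cov(coords[i:i + 1], coords[nb])[0]
            w = np.linalg.solve(c_nn, c_in)   # kriging weights
            mean = w @ y[nb]
            var -= w @ c_in
        total += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mean) ** 2 / var)
    return total

def full_loglik(coords, y, tau2=0.1, cov=exp_cov):
    """Exact zero-mean Gaussian log-likelihood, O(n^3), for comparison."""
    n = len(y)
    c = cov(coords, coords) + tau2 * np.eye(n)
    _, logdet = np.linalg.slogdet(c)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(c, y))

rng = np.random.default_rng(0)
coords = rng.uniform(size=(30, 2))
y = rng.normal(size=30)
approx = nngp_loglik(coords, y, m=5)    # sparse approximation
exact = full_loglik(coords, y)          # dense reference
```

When m is at least n - 1, every earlier point is a neighbor and the approximation coincides with the exact likelihood. Smaller m trades accuracy for a per-point cost of order m^3.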
Pages: 20
Related papers
50 in total
  • [1] On nearest-neighbor Gaussian process models for massive spatial data
    Datta, Abhirup
    Banerjee, Sudipto
    Finley, Andrew O.
    Gelfand, Alan E.
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2016, 8 (05) : 162 - 171
  • [2] Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets
    Datta, Abhirup
    Banerjee, Sudipto
    Finley, Andrew O.
    Gelfand, Alan E.
    JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2016, 111 (514) : 800 - 812
  • [3] Efficient Nearest-Neighbor Data Sharing in GPUs
    Nematollahi, Negin
    Sadrosadati, Mohammad
    Falahati, Hajar
    Barkhordar, Marzieh
    Drumond, Mario Paulo
    Sarbazi-Azad, Hamid
    Falsafi, Babak
    ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2021, 18 (01)
  • [4] Computationally Efficient Techniques for Data-Driven Haptic Rendering
    Hoever, Raphael
    Di Luca, Massimiliano
    Szekely, Gabor
    Harders, Matthias
    WORLD HAPTICS 2009: THIRD JOINT EUROHAPTICS CONFERENCE AND SYMPOSIUM ON HAPTIC INTERFACES FOR VIRTUAL ENVIRONMENT AND TELEOPERATOR SYSTEMS, PROCEEDINGS, 2009, : 39 - +
  • [5] Nearest-Neighbor Mixture Models for Non-Gaussian Spatial Processes
    Zheng, Xiaotian
    Kottas, Athanasios
    Sanso, Bruno
    BAYESIAN ANALYSIS, 2023, 18 (04) : 1191 - 1222
  • [6] On Nonstationary Gaussian Process Model for Solving Data-Driven Optimization Problems
    Hu, Caie
    Zeng, Sanyou
    Li, Changhe
    Zhao, Fei
    IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (04) : 2440 - 2453
  • [7] Improving performances of MCMC for Nearest Neighbor Gaussian Process models with full data augmentation
    Coube-Sisqueille, Sebastien
    Liquet, Benoit
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2022, 168
  • [8] spNNGP R Package for Nearest Neighbor Gaussian Process Models
    Finley, Andrew O.
    Datta, Abhirup
    Banerjee, Sudipto
    JOURNAL OF STATISTICAL SOFTWARE, 2022, 103 (05) : 1 - 40
  • [9] Analysis of Tandem Repeat Protein Folding Using Nearest-Neighbor Models
    Petersen, Mark
    Barrick, Doug
    ANNUAL REVIEW OF BIOPHYSICS, VOL 50, 2021, 50 : 245 - 265
  • [10] HYPERSPECTRAL CLASSIFICATION USING A COMPOSITE KERNEL DRIVEN BY NEAREST-NEIGHBOR SPATIAL FEATURES
    Menon, Vineetha
    Prasad, Saurabh
    Fowler, James E.
    2015 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2015, : 2100 - 2104