An algorithm for distributed Bayesian inference

Cited by: 4
Authors
Shyamalkumar, Nariankadu D. [1]
Srivastava, Sanvesh [1]
Affiliations
[1] Univ Iowa, Dept Stat & Actuarial Sci, Iowa City, IA 52242 USA
Source
STAT | 2022 / Vol. 11 / Issue 01
Funding
National Science Foundation (USA);
Keywords
data augmentation; distributed computing; divide-and-conquer; location-scatter family; Monte Carlo computations; Wasserstein distance; BARYCENTERS; MODELS;
DOI
10.1002/sta4.432
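Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract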
Monte Carlo algorithms, such as Markov chain Monte Carlo (MCMC) and Hamiltonian Monte Carlo (HMC), are routinely used for Bayesian inference; however, these algorithms are prohibitively slow in massive data settings because they require multiple passes through the full data in every iteration. Addressing this problem, we develop a scalable extension of these algorithms using the divide-and-conquer (D&C) technique, which divides the data into a sufficiently large number of subsets, draws parameters in parallel on the subsets using a powered likelihood, and produces Monte Carlo draws of the parameter by combining the parameter draws obtained from each subset. The combined parameter draws play the role of draws from the original sampling algorithm. Our main contributions are twofold. First, we demonstrate through diverse simulated and real data analyses focusing on generalized linear models (GLMs) that our distributed algorithm delivers results comparable to those of the current state-of-the-art D&C algorithms in terms of statistical accuracy and computational efficiency. Second, providing theoretical support for our empirical observations, we identify regularity assumptions under which the proposed algorithm leads to asymptotically optimal inference. We also provide illustrative examples focusing on normal linear and logistic regressions where parts of our D&C algorithm are analytically tractable.
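The three steps named in the abstract (split the data, sample on each subset under a powered likelihood, combine the subset draws) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy normal-mean model with known variance so that each subset posterior is available in closed form, and it assumes a simple center-and-scale combination rule (averaging subset means and standard deviations, which in one dimension matches the Wasserstein barycenter of Gaussians). The function names, the prior variance `tau2`, and the choice of power `k` equal to the number of subsets are all hypothetical choices made for illustration.

```python
import numpy as np

def subset_posterior_draws(y_sub, k, n_draws, rng, sigma2=1.0, tau2=100.0):
    """Draw from a subset posterior for a normal mean with known variance.

    The subset likelihood is raised to the power k (assumed here to be the
    number of subsets), so each subset posterior has roughly the spread of
    the full-data posterior. Prior: theta ~ N(0, tau2).
    """
    prec = 1.0 / tau2 + k * y_sub.size / sigma2    # powered-likelihood precision
    mean = (k * y_sub.sum() / sigma2) / prec       # conjugate posterior mean
    return rng.normal(mean, np.sqrt(1.0 / prec), size=n_draws)

def combine_draws(draws):
    """Combine subset draws by re-centering and re-scaling (an assumed rule).

    draws has shape (k, n_draws). Each row is standardized, then mapped to
    the average subset mean and average subset standard deviation.
    """
    means = draws.mean(axis=1, keepdims=True)
    sds = draws.std(axis=1, ddof=1, keepdims=True)
    standardized = (draws - means) / sds
    return (means.mean() + sds.mean() * standardized).ravel()

rng = np.random.default_rng(0)
n, k, n_draws = 10_000, 10, 2_000
y = rng.normal(2.0, 1.0, size=n)

# Step 1: divide the data into k disjoint subsets.
subsets = np.array_split(rng.permutation(y), k)
# Step 2: sample on each subset (sequential here; parallel in practice).
draws = np.stack([subset_posterior_draws(s, k, n_draws, rng) for s in subsets])
# Step 3: combine the subset draws into one set of draws.
combined = combine_draws(draws)

# Closed-form full-data posterior, used only as a reference for comparison.
full_prec = 1.0 / 100.0 + n / 1.0
full_mean = y.sum() / full_prec
full_sd = full_prec ** -0.5
print(combined.mean(), full_mean)
print(combined.std(), full_sd)
```

In this toy setting the combined draws should closely track the full-data posterior mean and standard deviation. For GLMs such as logistic regression, the closed-form draw in `subset_posterior_draws` would be replaced by an MCMC or HMC sampler run on each subset.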
Pages: 14
Related papers
50 in total
  • [1] A distributed learning algorithm for Bayesian inference networks
    Lam, W
    Segre, AM
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2002, 14 (01) : 93 - 105
  • [2] DISTRIBUTED INFERENCE IN BAYESIAN NETWORKS
    DIEZ, FJ
    MIRA, J
    CYBERNETICS AND SYSTEMS, 1994, 25 (01) : 39 - 61
  • [3] Simulation algorithm for Bayesian network inference
    Hu, Zhao-Yong
    Qu, Liang-Sheng
    Xitong Fangzhen Xuebao / Journal of System Simulation, 2004, 16 (02):
  • [4] Approximate algorithm for Bayesian network inference
    Han Wei
    Ji Qiong
    Proceedings of 2005 Chinese Control and Decision Conference, Vols 1 and 2, 2005, : 1176 - 1180
  • [5] An optimal approximation algorithm for Bayesian inference
    Dagum, P
    Luby, M
    ARTIFICIAL INTELLIGENCE, 1997, 93 (1-2) : 1 - 27
  • [6] Optimal approximation algorithm for Bayesian inference
    Stanford Univ Sch of Medicine, Stanford, United States
    Artif Intell, 1-2 (1-27):
  • [7] Node aggregation for distributed inference in bayesian networks
    1600, Morgan Kaufmann Publ Inc, San Mateo, CA, USA (01):
  • [8] Distributed Bayesian Inference Over Sensor Networks
    Ye, Baijia
    Qin, Jiahu
    Fu, Weiming
    Zhu, Yingda
    Wang, Yaonan
    Kang, Yu
    IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (03) : 1587 - 1597
  • [9] Streaming, Distributed Variational Inference for Bayesian Nonparametrics
    Campbell, Trevor
    Straub, Julian
    Fisher, John W., III
    How, Jonathan P.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [10] Distributed Bayesian Inference in Massive Spatial Data
    Guhaniyogi, Rajarshi
    Li, Cheng
    Savitsky, Terrance
    Srivastava, Sanvesh
    STATISTICAL SCIENCE, 2023, 38 (02) : 262 - 284