Learning Deep Generative Models With Doubly Stochastic Gradient MCMC

Cited by: 8
Authors:
Du, Chao [1 ]
Zhu, Jun [1 ]
Zhang, Bo [1 ]
Affiliations:
[1] Tsinghua Univ, Dept Comp Sci & Technol, State Key Lab Intelligent Technol & Syst, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
Keywords:
Bayesian methods; deep generative models (DGMs); deep learning; Markov chain Monte Carlo (MCMC); stochastic gradient; BACKPROPAGATION; INFERENCE; ALGORITHM; NETWORKS;
DOI:
10.1109/TNNLS.2017.2688499
Chinese Library Classification (CLC):
TP18 [Artificial Intelligence Theory];
Subject Classification Codes:
081104 ; 0812 ; 0835 ; 1405 ;
Abstract:
Deep generative models (DGMs), which are often organized in a hierarchical manner, provide a principled framework for capturing the underlying causal factors of data. Recent work on DGMs has focused on developing efficient and scalable variational inference methods that learn a single model under certain mean-field or parameterization assumptions. However, little work has been done on extending Markov chain Monte Carlo (MCMC) methods, which enjoy many advantages over variational methods, to Bayesian DGMs. We present doubly stochastic gradient MCMC, a simple and generic method for (approximate) Bayesian inference of DGMs in a collapsed continuous parameter space. At each MCMC sampling step, the algorithm randomly draws a mini-batch of data samples to estimate the gradient of the log-posterior and further estimates the intractable expectation over hidden variables via a neural adaptive importance sampler, where the proposal distribution is parameterized by a deep neural network and learned jointly with the sampling process. We demonstrate the effectiveness of our method for learning various DGMs on a wide range of tasks, including density estimation, data generation, and missing data imputation. Our method outperforms many state-of-the-art competitors.
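The abstract describes two sources of stochasticity in each update: a mini-batch estimate of the log-posterior gradient, and an importance-sampling estimate of the expectation over hidden variables with an adaptive proposal. The following is a minimal sketch of that scheme for a toy linear-Gaussian latent variable model, using SGLD as the underlying sampler and a linear proposal map standing in for the paper's deep-network proposal; the model, the proposal adaptation rule, and all hyperparameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Toy model (assumed for illustration): x ~ N(W h, sigma2 I), h ~ N(0, I).
# theta = W is sampled with SGLD; q(h | x) = N(A x, I) is a stand-in for the
# paper's deep-network proposal and is adapted jointly with the sampler.
rng = np.random.default_rng(0)
D, K, N = 5, 2, 1000                       # data dim, latent dim, dataset size
W_true = rng.normal(size=(D, K))
X = rng.normal(size=(N, K)) @ W_true.T + rng.normal(size=(N, D))

sigma2 = 1.0                               # observation noise variance (fixed)
W = 0.1 * rng.normal(size=(D, K))          # current parameter sample
A = np.zeros((K, D))                       # proposal mean map
eps, lr_q, B, S = 1e-4, 1e-3, 64, 10       # step size, proposal lr, batch, IS samples

for t in range(2000):
    idx = rng.choice(N, B, replace=False)  # stochasticity 1: data mini-batch
    gW = np.zeros_like(W)
    gA = np.zeros_like(A)
    for x in X[idx]:
        mu = A @ x
        hs = mu + rng.normal(size=(S, K))  # stochasticity 2: samples from q(h | x)
        # self-normalized importance weights  p(x, h | W) / q(h | x)
        log_p = (-0.5 * np.sum((x - hs @ W.T) ** 2, axis=1) / sigma2
                 - 0.5 * np.sum(hs ** 2, axis=1))
        log_q = -0.5 * np.sum((hs - mu) ** 2, axis=1)
        w = np.exp(log_p - log_q - np.max(log_p - log_q))
        w /= w.sum()
        # E_{p(h|x,W)}[grad_W log p(x, h | W)] estimated by importance sampling
        for wi, h in zip(w, hs):
            gW += wi * np.outer(x - W @ h, h) / sigma2
        # illustrative proposal adaptation: move A x toward the weighted mean of h
        gA += np.outer(w @ hs - mu, x)
    # SGLD step with a standard normal prior on W (grad log prior = -W)
    gW = -W + (N / B) * gW
    W += 0.5 * eps * gW + np.sqrt(eps) * rng.normal(size=W.shape)
    A += lr_q * gA / B                     # jointly adapt the proposal
```

The inner loop replaces the intractable posterior expectation over h with a self-normalized importance-sampling estimate; in the paper, the Gaussian proposal above would instead be a deep recognition network whose parameters play the role of A.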
Pages: 3084-3096
Number of pages: 13
Related Papers (50 in total)
  • [1] Stochastic Gradient MCMC for State Space Models
    Aicher, Christopher
    Ma, Yi-An
    Foti, Nicholas J.
    Fox, Emily B.
    [J]. SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2019, 1 (03): : 555 - 587
  • [2] Stochastic Gradient MCMC Methods for Hidden Markov Models
    Ma, Yi-An
    Foti, Nicholas J.
    Fox, Emily B.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [3] Learning Deep Generative Models
    Salakhutdinov, Ruslan
    [J]. ANNUAL REVIEW OF STATISTICS AND ITS APPLICATION, VOL 2, 2015, 2 : 361 - 385
  • [4] Learning Weight Uncertainty with Stochastic Gradient MCMC for Shape Classification
    Li, Chunyuan
    Stevens, Andrew
    Chen, Changyou
    Pu, Yunchen
    Gan, Zhe
    Carin, Lawrence
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 5666 - 5675
  • [5] Multimodal deep generative adversarial models for scalable doubly semi-supervised learning
    Du, Changde
    Du, Changying
    He, Huiguang
    [J]. INFORMATION FUSION, 2021, 68 : 118 - 130
  • [6] Structured Stochastic Gradient MCMC
    Alexos, Antonios
    Boyd, Alex
    Mandt, Stephan
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022, : 414 - 434
  • [7] Distributed Stochastic Gradient MCMC
    Ahn, Sungjin
    Shahbaba, Babak
    Welling, Max
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2), 2014, 32 : 1044 - 1052
  • [8] Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models
    Teng, Yunfei
    Gao, Wenbo
    Chalus, Francois
    Choromanska, Anna
    Goldfarb, Donald
    Weller, Adrian
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [9] Bayesian sparse learning with preconditioned stochastic gradient MCMC and its applications
    Wang, Yating
    Deng, Wei
    Lin, Guang
    [J]. JOURNAL OF COMPUTATIONAL PHYSICS, 2021, 432
  • [10] πVAE: a stochastic process prior for Bayesian deep learning with MCMC
    Mishra, Swapnil
    Flaxman, Seth
    Berah, Tresnia
    Zhu, Harrison
    Pakkanen, Mikko
    Bhatt, Samir
    [J]. STATISTICS AND COMPUTING, 2022, 32 (06)