Learning Deep Generative Models With Doubly Stochastic Gradient MCMC

Cited by: 8
Authors:
Du, Chao [1 ]
Zhu, Jun [1 ]
Zhang, Bo [1 ]
Affiliations:
[1] Tsinghua Univ, Dept Comp Sci & Technol, State Key Lab Intelligent Technol & Syst, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China
Keywords:
Bayesian methods; deep generative models (DGMs); deep learning; Markov chain Monte Carlo (MCMC); stochastic gradient; BACKPROPAGATION; INFERENCE; ALGORITHM; NETWORKS;
DOI:
10.1109/TNNLS.2017.2688499
Chinese Library Classification (CLC):
TP18 [Artificial Intelligence Theory];
Subject Classification Codes:
081104 ; 0812 ; 0835 ; 1405 ;
Abstract:
Deep generative models (DGMs), which are often organized in a hierarchical manner, provide a principled framework for capturing the underlying causal factors of data. Recent work on DGMs has focused on developing efficient and scalable variational inference methods that learn a single model under certain mean-field or parameterization assumptions. However, little work has been done on extending Markov chain Monte Carlo (MCMC) methods, which enjoy many advantages over variational methods, to Bayesian DGMs. We present doubly stochastic gradient MCMC, a simple and generic method for (approximate) Bayesian inference of DGMs in a collapsed continuous parameter space. At each MCMC sampling step, the algorithm randomly draws a mini-batch of data samples to estimate the gradient of the log-posterior and further estimates the intractable expectation over hidden variables via a neural adaptive importance sampler, where the proposal distribution is parameterized by a deep neural network and learned jointly with the sampling process. We demonstrate the effectiveness of our method for learning various DGMs on a wide range of tasks, including density estimation, data generation, and missing data imputation. Our method outperforms many state-of-the-art competitors.
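The abstract describes two sources of stochasticity in each update: a mini-batch estimate of the log-posterior gradient, and an importance-sampling estimate of the expectation over hidden variables with an adaptive proposal. The following is a minimal sketch of that scheme for a toy linear-Gaussian latent variable model, using SGLD as the underlying sampler and a linear proposal map standing in for the paper's deep-network proposal; the model, the proposal adaptation rule, and all hyperparameters are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Toy model (assumed for illustration): x ~ N(W h, sigma2 I), h ~ N(0, I).
# theta = W is sampled with SGLD; q(h | x) = N(A x, I) is a stand-in for the
# paper's deep-network proposal and is adapted jointly with the sampler.
rng = np.random.default_rng(0)
D, K, N = 5, 2, 1000                       # data dim, latent dim, dataset size
W_true = rng.normal(size=(D, K))
X = rng.normal(size=(N, K)) @ W_true.T + rng.normal(size=(N, D))

sigma2 = 1.0                               # observation noise variance (fixed)
W = 0.1 * rng.normal(size=(D, K))          # current parameter sample
A = np.zeros((K, D))                       # proposal mean map
eps, lr_q, B, S = 1e-4, 1e-3, 64, 10       # step size, proposal lr, batch, IS samples

for t in range(2000):
    idx = rng.choice(N, B, replace=False)  # stochasticity 1: data mini-batch
    gW = np.zeros_like(W)
    gA = np.zeros_like(A)
    for x in X[idx]:
        mu = A @ x
        hs = mu + rng.normal(size=(S, K))  # stochasticity 2: samples from q(h | x)
        # self-normalized importance weights  p(x, h | W) / q(h | x)
        log_p = (-0.5 * np.sum((x - hs @ W.T) ** 2, axis=1) / sigma2
                 - 0.5 * np.sum(hs ** 2, axis=1))
        log_q = -0.5 * np.sum((hs - mu) ** 2, axis=1)
        w = np.exp(log_p - log_q - np.max(log_p - log_q))
        w /= w.sum()
        # E_{p(h|x,W)}[grad_W log p(x, h | W)] estimated by importance sampling
        for wi, h in zip(w, hs):
            gW += wi * np.outer(x - W @ h, h) / sigma2
        # illustrative proposal adaptation: move A x toward the weighted mean of h
        gA += np.outer(w @ hs - mu, x)
    # SGLD step with a standard normal prior on W (grad log prior = -W)
    gW = -W + (N / B) * gW
    W += 0.5 * eps * gW + np.sqrt(eps) * rng.normal(size=W.shape)
    A += lr_q * gA / B                     # jointly adapt the proposal
```

The inner loop replaces the intractable posterior expectation over h with a self-normalized importance-sampling estimate; in the paper, the Gaussian proposal above would instead be a deep recognition network whose parameters play the role of A.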
Pages: 3084-3096
Number of pages: 13
Related Papers (50 in total)
  • [1] Stochastic Gradient MCMC for State Space Models
    Aicher, Christopher
    Ma, Yi-An
    Foti, Nicholas J.
    Fox, Emily B.
    [J]. SIAM JOURNAL ON MATHEMATICS OF DATA SCIENCE, 2019, 1 (03): : 555 - 587
  • [2] Stochastic Gradient MCMC Methods for Hidden Markov Models
    Ma, Yi-An
    Foti, Nicholas J.
    Fox, Emily B.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017, 70
  • [3] Learning Deep Generative Models
    Salakhutdinov, Ruslan
    [J]. ANNUAL REVIEW OF STATISTICS AND ITS APPLICATION, VOL 2, 2015, 2 : 361 - 385
  • [4] Learning Weight Uncertainty with Stochastic Gradient MCMC for Shape Classification
    Li, Chunyuan
    Stevens, Andrew
    Chen, Changyou
    Pu, Yunchen
    Gan, Zhe
    Carin, Lawrence
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 5666 - 5675
  • [5] Multimodal deep generative adversarial models for scalable doubly semi-supervised learning
    Du, Changde
    Du, Changying
    He, Huiguang
    [J]. INFORMATION FUSION, 2021, 68 : 118 - 130
  • [6] Structured Stochastic Gradient MCMC
    Alexos, Antonios
    Boyd, Alex
    Mandt, Stephan
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022, : 414 - 434
  • [7] Distributed Stochastic Gradient MCMC
    Ahn, Sungjin
    Shahbaba, Babak
    Welling, Max
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 32 (CYCLE 2), 2014, 32 : 1044 - 1052
  • [8] Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models
    Teng, Yunfei
    Gao, Wenbo
    Chalus, Francois
    Choromanska, Anna
    Goldfarb, Donald
    Weller, Adrian
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [9] Bayesian sparse learning with preconditioned stochastic gradient MCMC and its applications
    Wang, Yating
    Deng, Wei
    Lin, Guang
    [J]. JOURNAL OF COMPUTATIONAL PHYSICS, 2021, 432
  • [10] πVAE: a stochastic process prior for Bayesian deep learning with MCMC
    Mishra, Swapnil
    Flaxman, Seth
    Berah, Tresnia
    Zhu, Harrison
    Pakkanen, Mikko
    Bhatt, Samir
    [J]. STATISTICS AND COMPUTING, 2022, 32 (06)