High-Order Stochastic Gradient Thermostats for Bayesian Learning of Deep Models

Cited by: 0
Authors
Li, Chunyuan [1 ]
Chen, Changyou [1 ]
Fan, Kai [2 ]
Carin, Lawrence [1 ]
Affiliations
[1] Duke Univ, Dept Elect & Comp Engn, Durham, NC 27706 USA
[2] Duke Univ, Computat Biol & Bioinformat, Durham, NC 27706 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning in deep models using Bayesian methods has generated significant attention recently. This is largely because of the ability of modern Bayesian methods to deliver scalable learning and inference while maintaining a measure of uncertainty in the model parameters. Stochastic gradient MCMC algorithms (SG-MCMC) are a family of diffusion-based sampling methods for large-scale Bayesian learning. In SG-MCMC, multivariate stochastic gradient thermostats (mSGNHT) augment each parameter of interest with a momentum and a thermostat variable, so that the stationary distribution of the dynamics matches the target posterior. As the number of variables in a continuous-time diffusion increases, its numerical approximation error becomes a practical bottleneck, so a more accurate numerical integrator is desirable. To this end, we propose the use of an efficient symmetric splitting integrator in mSGNHT, in place of the traditional Euler integrator. We demonstrate that the proposed scheme is more accurate, more robust, and converges faster. These properties are shown to be desirable for Bayesian deep learning. Extensive experiments on two canonical models and their deep extensions demonstrate that the proposed scheme improves general Bayesian posterior sampling, particularly for deep models.
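The abstract contrasts the Euler integrator with a symmetric splitting scheme for the mSGNHT diffusion. A minimal sketch of that idea follows, assuming the standard mSGNHT dynamics (dθ = p dt; dp = (−∇U(θ) − ξ⊙p) dt + √(2D) dW; dξ = (p⊙p − 1) dt) and an ABOBA-style symmetric ordering of sub-steps; the function name `msgnht_splitting`, the step size, the initialization, and the toy Gaussian target are all illustrative assumptions, not the authors' code:

```python
import numpy as np

def msgnht_splitting(grad_U, theta0, n_steps=20000, h=0.05, D=1.0, seed=0):
    """Sketch of an mSGNHT sampler with a symmetric splitting integrator.

    The dynamics are split into three exactly-solvable pieces:
      A: dtheta = p dt, dxi = (p*p - 1) dt   (half step)
      B: dp = -xi * p dt  -> p *= exp(-xi h/2)  (half step, solved exactly)
      O: dp = -grad_U dt + sqrt(2 D) dW        (full step)
    applied in the symmetric order A-B-O-B-A.
    """
    rng = np.random.default_rng(seed)
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    p = np.zeros_like(theta)
    xi = np.full_like(theta, D)  # thermostat variable, initialized at D
    samples = np.empty((n_steps, theta.size))
    for t in range(n_steps):
        # A: half step for the parameter and the thermostat
        theta += (h / 2) * p
        xi += (h / 2) * (p * p - 1.0)
        # B: half step of momentum decay, solved in closed form
        p *= np.exp(-(h / 2) * xi)
        # O: full-step gradient kick plus injected Gaussian noise
        p += -h * grad_U(theta) + np.sqrt(2.0 * D * h) * rng.standard_normal(theta.shape)
        # B and A again, mirroring the first half of the step
        p *= np.exp(-(h / 2) * xi)
        xi += (h / 2) * (p * p - 1.0)
        theta += (h / 2) * p
        samples[t] = theta
    return samples

# Toy check: sample a standard normal posterior, U(theta) = theta^2 / 2.
samples = msgnht_splitting(lambda th: th, theta0=[0.0])
```

The appeal of the splitting, relative to a plain Euler update, is that each sub-step is integrated exactly (the momentum-decay piece in particular), which is what yields the improved accuracy and robustness the abstract reports.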
Pages: 1795-1801
Page count: 7
Related Papers
(50 total)
  • [1] Bayesian Sampling Using Stochastic Gradient Thermostats
    Ding, Nan
    Fang, Youhan
    Babbush, Ryan
    Chen, Changyou
    Skeel, Robert D.
    Neven, Hartmut
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 27 (NIPS 2014), 2014, 27
  • [2] Stabilization of High-Order Stochastic Gradient Adaptive Filtering Algorithms
    Eweda, Eweda
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2017, 65 (15) : 3948 - 3959
  • [3] On the Convergence of Stochastic Gradient MCMC Algorithms with High-Order Integrators
    Chen, Changyou
    Ding, Nan
    Carin, Lawrence
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 28 (NIPS 2015), 2015, 28
  • [4] High-order discretization schemes for stochastic volatility models
    Jourdain, Benjamin
    Sbai, Mohamed
    [J]. JOURNAL OF COMPUTATIONAL FINANCE, 2013, 17 (02) : 113 - 165
  • [5] Learning Deep Generative Models With Doubly Stochastic Gradient MCMC
    Du, Chao
    Zhu, Jun
    Zhang, Bo
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (07) : 3084 - 3096
  • [6] Compressing Parameters in Bayesian High-order Models with Application to Logistic Sequence Models
    Li, Longhai
    Neal, Radford M.
    [J]. BAYESIAN ANALYSIS, 2008, 3 (04): : 793 - 821
  • [7] Evaluating High-Order Predictive Distributions in Deep Learning
    Osband, Ian
    Wen, Zheng
    Asghari, Seyed Mohammad
    Dwaracherla, Vikranth
    Lu, Xiuyuan
    Van Roy, Benjamin
    [J]. UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, VOL 180, 2022, 180 : 1552 - 1560
  • [8] First order versus high-order stochastic models for computer intrusion detection
    Ye, N
    Ehiabor, T
    Zhang, YB
    [J]. QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, 2002, 18 (03) : 243 - 250
  • [9] On iterative learning control with high-order internal models
    Liu, Chunping
    Xu, Jianxin
    Wu, Jun
    [J]. INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, 2010, 24 (09) : 731 - 742
  • [10] On Iterative Learning Control with High-Order Internal Models
    Liu, Chunping
    Xu, Jianxin
    Wu, Jun
    Tan, Ying
    [J]. 2009 IEEE INTERNATIONAL CONFERENCE ON CONTROL AND AUTOMATION, VOLS 1-3, 2009, : 1565+