High-Order Stochastic Gradient Thermostats for Bayesian Learning of Deep Models

Cited by: 0
Authors
Li, Chunyuan [1]
Chen, Changyou [1]
Fan, Kai [2]
Carin, Lawrence [1]
Affiliations
[1] Duke Univ, Dept Elect & Comp Engn, Durham, NC 27706 USA
[2] Duke Univ, Computat Biol & Bioinformat, Durham, NC 27706 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Learning in deep models using Bayesian methods has generated significant attention recently. This is largely because modern Bayesian methods can yield scalable learning and inference while maintaining a measure of uncertainty in the model parameters. Stochastic gradient MCMC algorithms (SG-MCMC) are a family of diffusion-based sampling methods for large-scale Bayesian learning. In SG-MCMC, multivariate stochastic gradient thermostats (mSGNHT) augment each parameter of interest with a momentum and a thermostat variable, so that the stationary distribution remains the target posterior distribution. As the number of variables in a continuous-time diffusion increases, its numerical approximation error becomes a practical bottleneck, so a better numerical integrator is desirable. To this end, we propose using an efficient symmetric splitting integrator in mSGNHT instead of the traditional Euler integrator. We demonstrate that the proposed scheme is more accurate, more robust, and converges faster, properties that are desirable in Bayesian deep learning. Extensive experiments on two canonical models and their deep extensions demonstrate that the proposed scheme improves general Bayesian posterior sampling, particularly for deep models.
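The abstract names the two integrators but does not state their update equations. The following is only a rough sketch of the contrast, under stated assumptions: the thermostat diffusion is taken to be dθ = p dt, dp = -∇U(θ) dt - ξ⊙p dt + sqrt(2D) dW, dξ = (p⊙p - 1) dt, and the symmetric splitting is taken to be a half/full/half composition of three sub-steps. The function names, the step size `h`, the injected-noise constant `D`, and the toy Gaussian target are all hypothetical and are not taken from this record.

```python
import math
import random


def msgnht_euler_step(theta, p, xi, grad_u, h, D, rng):
    """Plain Euler-style update: every variable advances from the current state."""
    n = len(theta)
    g = grad_u(theta)
    scale = math.sqrt(2.0 * D * h)
    new_theta = [theta[i] + h * p[i] for i in range(n)]
    new_p = [p[i] - h * g[i] - h * xi[i] * p[i] + scale * rng.gauss(0.0, 1.0)
             for i in range(n)]
    new_xi = [xi[i] + h * (p[i] * p[i] - 1.0) for i in range(n)]
    return new_theta, new_p, new_xi


def msgnht_splitting_step(theta, p, xi, grad_u, h, D, rng):
    """Symmetric splitting sketch (assumed sub-step order: A, B, O, B, A)."""
    n = len(theta)
    half = 0.5 * h
    # A, half step: move parameters and thermostats with the current momenta.
    theta = [theta[i] + half * p[i] for i in range(n)]
    xi = [xi[i] + half * (p[i] * p[i] - 1.0) for i in range(n)]
    # B, half step: thermostat friction on the momenta, solved in closed form.
    p = [math.exp(-half * xi[i]) * p[i] for i in range(n)]
    # O, full step: (stochastic) gradient force plus injected Gaussian noise.
    g = grad_u(theta)
    scale = math.sqrt(2.0 * D * h)
    p = [p[i] - h * g[i] + scale * rng.gauss(0.0, 1.0) for i in range(n)]
    # B, half step: friction again, mirroring the first B.
    p = [math.exp(-half * xi[i]) * p[i] for i in range(n)]
    # A, half step: finish moving parameters and thermostats.
    theta = [theta[i] + half * p[i] for i in range(n)]
    xi = [xi[i] + half * (p[i] * p[i] - 1.0) for i in range(n)]
    return theta, p, xi


def mean_squared_momentum(step_fn, n_steps, h, D, seed):
    """Run a 2-D toy chain on U(theta) = |theta|^2 / 2 and average p*p."""
    rng = random.Random(seed)
    grad_u = lambda th: list(th)  # exact gradient of the toy Gaussian target
    theta, p, xi = [0.0, 0.0], [1.0, 1.0], [D, D]
    total = 0.0
    for _ in range(n_steps):
        theta, p, xi = step_fn(theta, p, xi, grad_u, h, D, rng)
        total += sum(v * v for v in p) / len(p)
    return total / n_steps
```

Calling `mean_squared_momentum(msgnht_splitting_step, 20000, 0.05, 1.0, seed=0)` gives a quick sanity check: the thermostat variables are meant to hold the average of each squared momentum near 1. In a real stochastic-gradient setting `grad_u` would be a minibatch estimate rather than the exact gradient used in this toy.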
Pages: 1795-1801
Number of pages: 7