How to Pretrain Deep Boltzmann Machines in Two Stages

Authors
Cho, Kyunghyun [1]
Raiko, Tapani [1]
Ilin, Alexander [1]
Karhunen, Juha [1]
Affiliation
[1] Aalto Univ, Sch Sci, Dept Informat & Comp Sci, Espoo, Finland
Keywords
ALGORITHM; GRADIENT;
DOI
10.1007/978-3-319-09903-3_10
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
A deep Boltzmann machine (DBM) is a recently introduced Markov random field model that has multiple layers of hidden units. It has been shown empirically that a DBM is difficult to train with approximate maximum-likelihood learning using stochastic gradients, unlike its simpler special case, the restricted Boltzmann machine (RBM). In this paper, we propose a novel pretraining algorithm that consists of two stages: obtaining approximate posterior distributions over hidden units from a simpler model, and maximizing the variational lower bound given the fixed hidden posterior distributions. We show empirically that the proposed method overcomes the difficulty of training DBMs from randomly initialized parameters and yields a generative model that is better than, or comparable to, the one obtained with the conventional pretraining algorithm.
Pages: 201-219
Number of pages: 19