An Efficient Learning Procedure for Deep Boltzmann Machines

Cited by: 340
Authors
Salakhutdinov, Ruslan [1 ]
Hinton, Geoffrey [2 ]
Affiliations
[1] Univ Toronto, Dept Stat, Toronto, ON M5S 3G3, Canada
[2] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 3G3, Canada
Funding
Natural Sciences and Engineering Research Council of Canada;
DOI
10.1162/NECO_a_00311
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present a new learning algorithm for Boltzmann machines that contain many layers of hidden variables. Data-dependent statistics are estimated using a variational approximation that tends to focus on a single mode, and data-independent statistics are estimated using persistent Markov chains. The use of two quite different techniques for estimating the two types of statistic that enter into the gradient of the log likelihood makes it practical to learn Boltzmann machines with multiple hidden layers and millions of parameters. The learning can be made more efficient by using a layer-by-layer pretraining phase that initializes the weights sensibly. The pretraining also allows the variational inference to be initialized sensibly with a single bottom-up pass. We present results on the MNIST and NORB data sets showing that deep Boltzmann machines learn very good generative models of handwritten digits and 3D objects. We also show that the features discovered by deep Boltzmann machines are a very effective way to initialize the hidden layers of feedforward neural nets, which are then discriminatively fine-tuned.
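The abstract describes how the two kinds of statistics are estimated: data-dependent (positive-phase) statistics by a mean-field variational approximation, and data-independent (negative-phase) statistics by persistent Markov chains. The following NumPy sketch illustrates one parameter update for a two-hidden-layer Boltzmann machine in that spirit. It is not the authors' code: layer sizes, learning rate, iteration counts, and chain count are illustrative assumptions, and bias terms and the pretraining phase are omitted.

# Minimal sketch of one gradient step for a two-hidden-layer deep Boltzmann machine:
# positive statistics from mean-field inference, negative statistics from persistent
# Gibbs chains. All hyperparameters below are assumptions, not the paper's settings.
import numpy as np

rng = np.random.default_rng(0)
n_v, n_h1, n_h2 = 784, 500, 1000                 # assumed layer sizes (MNIST-like)
W1 = 0.01 * rng.standard_normal((n_v, n_h1))     # visible-to-hidden1 weights
W2 = 0.01 * rng.standard_normal((n_h1, n_h2))    # hidden1-to-hidden2 weights

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_field(v, n_iters=10):
    """Data-dependent phase: fixed-point mean-field updates for q(h1), q(h2)."""
    q1 = sigmoid(v @ W1)                          # simple bottom-up initialization
    q2 = sigmoid(q1 @ W2)
    for _ in range(n_iters):
        q1 = sigmoid(v @ W1 + q2 @ W2.T)          # h1 receives input from v and h2
        q2 = sigmoid(q1 @ W2)
    return q1, q2

def gibbs_step(v, h1, h2):
    """One sweep of block Gibbs sampling on the persistent (fantasy) chains."""
    h1 = rng.binomial(1, sigmoid(v @ W1 + h2 @ W2.T))
    v  = rng.binomial(1, sigmoid(h1 @ W1.T))
    h2 = rng.binomial(1, sigmoid(h1 @ W2))
    return v, h1, h2

# Persistent fantasy particles, kept between parameter updates.
n_chains = 100
v_f  = rng.binomial(1, 0.5, size=(n_chains, n_v)).astype(float)
h1_f = rng.binomial(1, 0.5, size=(n_chains, n_h1)).astype(float)
h2_f = rng.binomial(1, 0.5, size=(n_chains, n_h2)).astype(float)

def update(v_batch, lr=0.001):
    """One gradient step on a minibatch of binary visible vectors."""
    global W1, W2, v_f, h1_f, h2_f
    # Positive (data-dependent) statistics from the variational approximation.
    q1, q2 = mean_field(v_batch)
    pos_W1 = v_batch.T @ q1 / len(v_batch)
    pos_W2 = q1.T @ q2 / len(v_batch)
    # Negative (data-independent) statistics from the persistent Markov chains.
    v_f, h1_f, h2_f = gibbs_step(v_f, h1_f, h2_f)
    neg_W1 = v_f.T @ h1_f / n_chains
    neg_W2 = h1_f.T @ h2_f / n_chains
    # Approximate gradient of the log likelihood: positive minus negative statistics.
    W1 += lr * (pos_W1 - neg_W1)
    W2 += lr * (pos_W2 - neg_W2)

In use, update would be called once per minibatch of binarized training vectors, with the fantasy particles v_f, h1_f, h2_f carried across updates so the negative-phase chains remain persistent; the layer-by-layer pretraining and the single bottom-up initialization pass described in the abstract are not shown.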
Pages: 1967-2006
Number of pages: 40
Related papers
50 records in total
  • [1] Multimodal Learning with Deep Boltzmann Machines
    Srivastava, Nitish
    Salakhutdinov, Ruslan
    JOURNAL OF MACHINE LEARNING RESEARCH, 2014, 15 : 2949 - 2980
  • [2] Denoising Deep Boltzmann Machines: Compression for Deep Learning
    Li, Qing
    Chen, Yang
    2020 DATA COMPRESSION CONFERENCE (DCC 2020), 2020, : 303 - 312
  • [3] Partitioned learning of deep Boltzmann machines for SNP data
    Hess, Moritz
    Lenz, Stefan
    Blaette, Tamara J.
    Bullinger, Lars
    Binder, Harald
    BIOINFORMATICS, 2017, 33 (20) : 3173 - 3180
  • [4] Parallel Tempering is Efficient for Learning Restricted Boltzmann Machines
    Cho, KyungHyun
    Raiko, Tapani
    Ilin, Alexander
    2010 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS IJCNN 2010, 2010
  • [5] Detection of Hypertension Retinopathy Using Deep Learning and Boltzmann Machines
    Triwijoyo, B. K.
    Pradipto, Y. D.
    1ST INTERNATIONAL CONFERENCE ON COMPUTING AND APPLIED INFORMATICS 2016 : APPLIED INFORMATICS TOWARD SMART ENVIRONMENT, PEOPLE, AND SOCIETY, 2017, 801
  • [6] Efficient learning in Boltzmann machines using linear response theory
    Kappen, HJ
    Rodriguez, FB
    NEURAL COMPUTATION, 1998, 10 (05) : 1137 - 1156
  • [7] Efficient Learning of Restricted Boltzmann Machines Using Covariance Estimates
    Upadhya, Vidyadhar
    Sastry, P. S.
    ASIAN CONFERENCE ON MACHINE LEARNING, VOL 101, 2019, 101 : 851 - 866
  • [8] Compression by and for Deep Boltzmann Machines
    Li, Qing
    Chen, Yang
    Kim, Yongjune
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2020, 68 (12) : 7498 - 7510
  • [9] SALIENCY DETECTION BASED ON FEATURE LEARNING USING DEEP BOLTZMANN MACHINES
    Wen, Shifeng
    Han, Junwei
    Zhang, Dingwen
    Guo, Lei
    2014 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2014