A Cyclic Contrastive Divergence Learning Algorithm for High-order RBMs

Cited by: 3
Authors
Luo, Dingsheng [1 ,2 ]
Wang, Yi [1 ,2 ]
Han, Xiaoqiang [1 ,2 ]
Wu, Xihong [1 ,2 ]
Affiliations
[1] Peking Univ, Minist Educ, Key Lab Machine Percept, Beijing 100871, Peoples R China
[2] Peking Univ, Speech & Hearing Res Ctr, Beijing 100871, Peoples R China
Keywords
High-order RBMs; Cyclic Contrastive Divergence Learning; Gradient Approximation; Convergence; Upper Bound
DOI
10.1109/ICMLA.2014.18
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Restricted Boltzmann Machine (RBM), a special case of the general Boltzmann Machine and a typical probabilistic graphical model, has attracted much attention in recent years due to its power in extracting features and representing the distribution underlying the training data. The most commonly used algorithm for learning RBMs is Contrastive Divergence (CD), proposed by Hinton, which starts a Markov chain at a data point and runs the chain for only a few steps to obtain a low-variance estimator. However, for a high-order RBM, where there are interactions among its visible layers, the gradient approximation obtained via CD learning usually falls far from the log-likelihood gradient and may even cause CD learning to enter an infinite loop with high reconstruction error. In this paper, a new algorithm named Cyclic Contrastive Divergence (CCD) is introduced for learning high-order RBMs. Unlike standard CD, CCD updates the parameters with respect to each visible layer in turn, borrowing the idea of the cyclic block coordinate descent method. To evaluate the performance of the proposed CCD algorithm for high-order RBM learning, both CCD and standard CD are analyzed theoretically, covering convergence, an upper bound on the estimate, and a comparison of the biases of the two estimators, from which the advantage of CCD learning is revealed. Experiments on the MNIST handwritten digit classification task show that CCD is more applicable and consistently outperforms standard CD in both convergence speed and final performance.
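For readers unfamiliar with the baseline the abstract refers to, a minimal sketch of one CD-1 update for an ordinary binary RBM (not the paper's high-order variant) is given below. All names and the NumPy formulation are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    # Draw binary samples from element-wise Bernoulli probabilities
    return (rng.random(p.shape) < p).astype(float)

def cd1_update(W, b, c, v0, lr=0.1):
    """One CD-1 step for a standard binary RBM (illustrative sketch).

    W: (n_visible, n_hidden) weights; b: visible biases; c: hidden biases;
    v0: (batch, n_visible) data batch.
    """
    # Positive phase: hidden probabilities given the data
    ph0 = sigmoid(v0 @ W + c)
    h0 = sample(ph0)
    # One Gibbs step: reconstruct visibles, then hidden probabilities again
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = sample(pv1)
    ph1 = sigmoid(v1 @ W + c)
    # CD-1 approximation of the log-likelihood gradient
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / v0.shape[0]
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c
```

A CCD-style learner for a high-order RBM would, per the abstract, cycle over the visible layers and apply an analogous block update to each layer's parameters in turn, in the spirit of cyclic block coordinate descent, rather than updating all parameters from a single chain at once.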
Pages: 80-86
Page count: 7