Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables

Cited by: 158
Authors
Chickering, DM [1 ]
Heckerman, D [1 ]
Affiliation
[1] Microsoft Corp, Redmond, WA 98052 USA
Keywords
Bayesian model averaging; model selection; multinomial mixtures; clustering; unsupervised learning; Laplace approximation;
DOI
10.1023/A:1007469629108
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We discuss Bayesian methods for model averaging and model selection among Bayesian-network models with hidden variables. In particular, we examine large-sample approximations for the marginal likelihood of naive-Bayes models in which the root node is hidden. Such models are useful for clustering or unsupervised learning. We consider a Laplace approximation and the less accurate but more computationally efficient approximation known as the Bayesian Information Criterion (BIC), which is equivalent to Rissanen's (1987) Minimum Description Length (MDL). Also, we consider approximations that ignore some off-diagonal elements of the observed information matrix and an approximation proposed by Cheeseman and Stutz (1995). We evaluate the accuracy of these approximations using a Monte-Carlo gold standard. In experiments with artificial and real examples, we find that (1) none of the approximations are accurate when used for model averaging, (2) all of the approximations, with the exception of BIC/MDL, are accurate for model selection, (3) among the accurate approximations, the Cheeseman-Stutz and Diagonal approximations are the most computationally efficient, (4) all of the approximations, with the exception of BIC/MDL, can be sensitive to the prior distribution over model parameters, and (5) the Cheeseman-Stutz approximation can be more accurate than the other approximations, including the Laplace approximation, in situations where the parameters in the maximum a posteriori configuration are near a boundary.
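The BIC/MDL approximation named in the abstract scores a model by its maximized log-likelihood penalized by model complexity: log p(D|M) ≈ log p(D | θ̂, M) − (d/2) log N, where d is the number of free parameters and N the sample size. A minimal sketch of that score (the function name and example numbers below are illustrative, not from the paper):

```python
import math

def bic_score(loglik_at_mle: float, n_params: int, n_samples: int) -> float:
    """BIC approximation to the log marginal likelihood:
    log p(D|M) ~ log p(D | theta_hat, M) - (d/2) * log N.
    Up to sign convention, this is Rissanen's MDL score."""
    return loglik_at_mle - 0.5 * n_params * math.log(n_samples)

# Hypothetical example: a naive-Bayes model with 17 free parameters
# fit to 500 data points, reaching log-likelihood -1234.5 at the MLE.
score = bic_score(loglik_at_mle=-1234.5, n_params=17, n_samples=500)
```

Because the penalty grows with d, BIC favors simpler models at a fixed likelihood, which is why it is cheap to compute but, as the experiments here show, less accurate than the Laplace or Cheeseman-Stutz approximations for model selection.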
Pages: 181-212
Number of pages: 32