Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables

Cited by: 158
Authors
Chickering, DM [1 ]
Heckerman, D [1 ]
Affiliation
[1] Microsoft Corp, Redmond, WA 98052 USA
Keywords
Bayesian model averaging; model selection; multinomial mixtures; clustering; unsupervised learning; Laplace approximation;
DOI
10.1023/A:1007469629108
CLC number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We discuss Bayesian methods for model averaging and model selection among Bayesian-network models with hidden variables. In particular, we examine large-sample approximations for the marginal likelihood of naive-Bayes models in which the root node is hidden. Such models are useful for clustering or unsupervised learning. We consider a Laplace approximation and the less accurate but more computationally efficient approximation known as the Bayesian Information Criterion (BIC), which is equivalent to Rissanen's (1987) Minimum Description Length (MDL). Also, we consider approximations that ignore some off-diagonal elements of the observed information matrix and an approximation proposed by Cheeseman and Stutz (1995). We evaluate the accuracy of these approximations using a Monte-Carlo gold standard. In experiments with artificial and real examples, we find that (1) none of the approximations are accurate when used for model averaging, (2) all of the approximations, with the exception of BIC/MDL, are accurate for model selection, (3) among the accurate approximations, the Cheeseman-Stutz and Diagonal approximations are the most computationally efficient, (4) all of the approximations, with the exception of BIC/MDL, can be sensitive to the prior distribution over model parameters, and (5) the Cheeseman-Stutz approximation can be more accurate than the other approximations, including the Laplace approximation, in situations where the parameters in the maximum a posteriori configuration are near a boundary.
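The BIC/MDL approximation named in the abstract scores a model by its maximized log-likelihood penalized by model complexity: log p(D|M) ≈ log p(D | θ̂, M) − (d/2) log N, where d is the number of free parameters and N the sample size. A minimal sketch of that score (the function name and example numbers below are illustrative, not from the paper):

```python
import math

def bic_score(loglik_at_mle: float, n_params: int, n_samples: int) -> float:
    """BIC approximation to the log marginal likelihood:
    log p(D|M) ~ log p(D | theta_hat, M) - (d/2) * log N.
    Up to sign convention, this is Rissanen's MDL score."""
    return loglik_at_mle - 0.5 * n_params * math.log(n_samples)

# Hypothetical example: a naive-Bayes model with 17 free parameters
# fit to 500 data points, reaching log-likelihood -1234.5 at the MLE.
score = bic_score(loglik_at_mle=-1234.5, n_params=17, n_samples=500)
```

Because the penalty grows with d, BIC favors simpler models at a fixed likelihood, which is why it is cheap to compute but, as the experiments here show, less accurate than the Laplace or Cheeseman-Stutz approximations for model selection.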
Pages: 181-212
Number of pages: 32