Mixed-membership naive Bayes models

Cited by: 20
Authors
Shan, Hanhuai [1]
Banerjee, Arindam [1]
Affiliations
[1] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
Funding
U.S. National Science Foundation; U.S. National Aeronautics and Space Administration (NASA);
Keywords
Naive Bayes; Latent Dirichlet allocation; Mixed-membership; Generative models; Variational inference; Logistic regression; Maximum-likelihood;
DOI
10.1007/s10618-010-0198-2
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In recent years, mixture models have found widespread usage in discovering latent cluster structure from data. A popular special case of finite mixture models is the family of naive Bayes (NB) models, where the probability of a feature vector factorizes over the features for any given component of the mixture. Despite their popularity, naive Bayes models do not allow data points to belong to different component clusters with varying degrees, i.e., mixed memberships, which puts a restriction on their modeling ability. In this paper, we propose mixed-membership naive Bayes (MMNB) models. On one hand, MMNB can be viewed as a generalization of NB by putting a Dirichlet prior on top to allow mixed memberships. On the other hand, MMNB can also be viewed as a generalization of latent Dirichlet allocation (LDA) with the ability to handle heterogeneous feature vectors with different types of features, e.g., real, categorical, etc. We propose two variational inference algorithms to learn MMNB models. The first one is based on ideas originally used in LDA, and the second one uses substantially fewer variational parameters, leading to a significantly faster algorithm. Further, we extend MMNB/LDA to discriminative mixed-membership models for classification by suitably combining MMNB/LDA with multi-class logistic regression. The efficacy of the proposed mixed-membership models is demonstrated by extensive experiments on several datasets, including UCI benchmarks, recommendation systems, and text datasets.
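A minimal sketch of the generative process the abstract describes, assuming real-valued (Gaussian) features: a Dirichlet prior produces a point-specific membership vector, and each feature independently picks a latent component and is emitted from that component's per-feature distribution. The function name `sample_mmnb` and all parameter values below are illustrative, not taken from the paper.

```python
import numpy as np

def sample_mmnb(alpha, mus, sigmas, rng):
    """Draw one data point from a mixed-membership naive Bayes model
    with Gaussian features.

    alpha  : (k,)   Dirichlet hyperparameter over k components
    mus    : (k, d) per-component, per-feature means
    sigmas : (k, d) per-component, per-feature standard deviations
    """
    k, d = mus.shape
    theta = rng.dirichlet(alpha)        # point-specific mixed-membership weights
    z = rng.choice(k, size=d, p=theta)  # one latent component per feature
    cols = np.arange(d)
    x = rng.normal(mus[z, cols], sigmas[z, cols])  # naive Bayes emission
    return theta, z, x

rng = np.random.default_rng(0)
alpha = np.ones(3)                      # symmetric Dirichlet prior
mus = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
sigmas = np.ones((3, 2))
theta, z, x = sample_mmnb(alpha, mus, sigmas, rng)
```

Setting each row of `z` to a single shared value for all features recovers ordinary naive Bayes; letting `theta` vary per data point is exactly the mixed-membership relaxation.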
Pages: 1-62
Related papers (50 records)
  • [1] Mixed-membership naive Bayes models
    Shan, Hanhuai
    Banerjee, Arindam
    [J]. Data Mining and Knowledge Discovery, 2011, 23 : 1 - 62
  • [2] Discriminative Mixed-membership Models
    Shan, Hanhuai
    Banerjee, Arindam
    Oza, Nikunj C.
    [J]. 2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2009, : 466+
  • [3] Mixed-membership models of scientific publications
    Erosheva, E
    Fienberg, S
    Lafferty, J
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2004, 101 : 5220 - 5227
  • [4] Mixed-Membership Stochastic Block Models for Weighted Networks
    Dulac, A.
    Gaussier, E.
    Largeron, C.
    [J]. CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI 2020), 2020, 124 : 679 - 688
  • [5] Typology of Mixed-Membership Models: Towards a Design Method
    Heinrich, Gregor
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT II, 2011, 6912 : 32 - 47
  • [6] The Dataminer's Guide to Scalable Mixed-Membership and Nonparametric Bayesian Models
    Ahmed, Amr
    Smola, Alex
    [J]. 19TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'13), 2013, : 1528 - 1528
  • [7] Mixed-membership of experts stochastic blockmodel
    White, Arthur
    Murphy, Thomas Brendan
    [J]. NETWORK SCIENCE, 2016, 4 (01) : 48 - 80
  • [8] The Dataminer's Guide to Scalable Mixed-Membership and Nonparametric Bayesian Models
    Ahmed, Amr
    Smola, Alex
    [J]. 19TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'13), 2013, : 1527 - 1527
  • [9] Accurate and scalable social recommendation using mixed-membership stochastic block models
    Godoy-Lorite, Antonia
    Guimera, Roger
    Moore, Cristopher
    Sales-Pardo, Marta
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2016, 113 (50) : 14207 - 14212
  • [10] Dynamic Infinite Mixed-Membership Stochastic Blockmodel
    Fan, Xuhui
    Cao, Longbing
    Xu, Richard Yi Da
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (09) : 2072 - 2085