Mixed-membership naive Bayes models

Cited by: 20
Authors
Shan, Hanhuai [1]
Banerjee, Arindam [1]
Affiliations
[1] Univ Minnesota, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
Funding
U.S. National Science Foundation; U.S. National Aeronautics and Space Administration (NASA);
Keywords
Naive Bayes; Latent Dirichlet allocation; Mixed-membership; Generative models; Variational inference; Logistic regression; Maximum-likelihood;
DOI
10.1007/s10618-010-0198-2
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
In recent years, mixture models have found widespread usage in discovering latent cluster structure from data. A popular special case of finite mixture models is the family of naive Bayes (NB) models, where the probability of a feature vector factorizes over the features for any given component of the mixture. Despite their popularity, naive Bayes models do not allow data points to belong to different component clusters with varying degrees, i.e., mixed memberships, which puts a restriction on their modeling ability. In this paper, we propose mixed-membership naive Bayes (MMNB) models. On one hand, MMNB can be viewed as a generalization of NB by putting a Dirichlet prior on top to allow mixed memberships. On the other hand, MMNB can also be viewed as a generalization of latent Dirichlet allocation (LDA) with the ability to handle heterogeneous feature vectors with different types of features, e.g., real, categorical, etc. We propose two variational inference algorithms to learn MMNB models. The first one is based on ideas originally used in LDA, and the second one uses substantially fewer variational parameters, leading to a significantly faster algorithm. Further, we extend MMNB/LDA to discriminative mixed-membership models for classification by suitably combining MMNB/LDA with multi-class logistic regression. The efficacy of the proposed mixed-membership models is demonstrated by extensive experiments on several datasets, including UCI benchmarks, recommendation systems, and text datasets.
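A minimal sketch of the generative process the abstract describes, assuming real-valued (Gaussian) features: a Dirichlet prior produces a point-specific membership vector, and each feature independently picks a latent component and is emitted from that component's per-feature distribution. The function name `sample_mmnb` and all parameter values below are illustrative, not taken from the paper.

```python
import numpy as np

def sample_mmnb(alpha, mus, sigmas, rng):
    """Draw one data point from a mixed-membership naive Bayes model
    with Gaussian features.

    alpha  : (k,)   Dirichlet hyperparameter over k components
    mus    : (k, d) per-component, per-feature means
    sigmas : (k, d) per-component, per-feature standard deviations
    """
    k, d = mus.shape
    theta = rng.dirichlet(alpha)        # point-specific mixed-membership weights
    z = rng.choice(k, size=d, p=theta)  # one latent component per feature
    cols = np.arange(d)
    x = rng.normal(mus[z, cols], sigmas[z, cols])  # naive Bayes emission
    return theta, z, x

rng = np.random.default_rng(0)
alpha = np.ones(3)                      # symmetric Dirichlet prior
mus = np.array([[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]])
sigmas = np.ones((3, 2))
theta, z, x = sample_mmnb(alpha, mus, sigmas, rng)
```

Setting each row of `z` to a single shared value for all features recovers ordinary naive Bayes; letting `theta` vary per data point is exactly the mixed-membership relaxation.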
Pages: 1-62
Related papers (50 records)
  • [1] Mixed-membership naive Bayes models
    Shan, Hanhuai
    Banerjee, Arindam
    [J]. Data Mining and Knowledge Discovery, 2011, 23 : 1 - 62
  • [2] Discriminative Mixed-membership Models
    Shan, Hanhuai
    Banerjee, Arindam
    Oza, Nikunj C.
    [J]. 2009 9TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2009, : 466+
  • [3] Mixed-membership models of scientific publications
    Erosheva, E
    Fienberg, S
    Lafferty, J
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2004, 101 : 5220 - 5227
  • [4] Mixed-Membership Stochastic Block Models for Weighted Networks
    Dulac, A.
    Gaussier, E.
    Largeron, C.
    [J]. CONFERENCE ON UNCERTAINTY IN ARTIFICIAL INTELLIGENCE (UAI 2020), 2020, 124 : 679 - 688
  • [5] Typology of Mixed-Membership Models: Towards a Design Method
    Heinrich, Gregor
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PT II, 2011, 6912 : 32 - 47
  • [6] The Dataminer's Guide to Scalable Mixed-Membership and Nonparametric Bayesian Models
    Ahmed, Amr
    Smola, Alex
    [J]. 19TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'13), 2013, : 1528 - 1528
  • [7] Mixed-membership of experts stochastic blockmodel
    White, Arthur
    Murphy, Thomas Brendan
    [J]. NETWORK SCIENCE, 2016, 4 (01) : 48 - 80
  • [8] The Dataminer's Guide to Scalable Mixed-Membership and Nonparametric Bayesian Models
    Ahmed, Amr
    Smola, Alex
    [J]. 19TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'13), 2013, : 1527 - 1527
  • [9] Accurate and scalable social recommendation using mixed-membership stochastic block models
    Godoy-Lorite, Antonia
    Guimera, Roger
    Moore, Cristopher
    Sales-Pardo, Marta
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2016, 113 (50) : 14207 - 14212
  • [10] Dynamic Infinite Mixed-Membership Stochastic Blockmodel
    Fan, Xuhui
    Cao, Longbing
    Xu, Richard Yi Da
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (09) : 2072 - 2085