The topographic organization and visualization of binary data using multivariate-bernoulli latent variable models

Cited by: 14
Author
Girolami, M [1]
Affiliation
[1] Univ Paisley, Div Comp & Informat Syst, Appl Computat Intelligence Res Unit, Paisley PA1 2BE, Renfrew, Scotland
Source
IEEE TRANSACTIONS ON NEURAL NETWORKS | 2001 / Vol. 12 / No. 06
Keywords
data clustering; data mining; data visualization; generative modeling; probabilistic modeling; self-organization; text document processing; unsupervised learning;
DOI
10.1109/72.963773
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A nonlinear latent variable model for the topographic organization and subsequent visualization of multivariate binary data is presented. The generative topographic mapping (GTM) is a nonlinear factor analysis model for continuous data which assumes an isotropic Gaussian noise model and performs uniform sampling from a two-dimensional (2-D) latent space. Despite the success of the GTM when applied to continuous data, the development of a similar model for discrete binary data has been hindered due, in part, to the nonlinear link function inherent in the binomial distribution, which yields a log-likelihood that is nonlinear in the model parameters. This paper presents an effective method for the parameter estimation of a binary latent variable model, a binary version of the GTM, by adopting a variational approximation to the binomial likelihood. This approximation provides a log-likelihood which is quadratic in the model parameters and so obviates the necessity of an iterative M-step in the expectation maximization (EM) algorithm. The power of this method is demonstrated on two significant application domains: handwritten digit recognition and the topographic organization of semantically similar text-based documents.
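The abstract does not state which variational approximation is used. As an illustration only, the sketch below uses one well-known bound of this kind, the Jaakkola-Jordan lower bound on the log of the logistic sigmoid, which is quadratic in its argument. The choice of this particular bound, and the names `log_sigmoid`, `jj_lambda`, and `jj_lower_bound`, are assumptions made for this example and are not taken from the record above.

```python
import numpy as np

def log_sigmoid(x):
    # Numerically stable log(1 / (1 + exp(-x))).
    return -np.logaddexp(0.0, -np.asarray(x, dtype=float))

def jj_lambda(xi):
    # lambda(xi) = tanh(xi / 2) / (4 * xi), with the limit 1/8 as xi -> 0.
    xi = np.asarray(xi, dtype=float)
    near_zero = np.abs(xi) < 1e-8
    safe = np.where(near_zero, 1.0, xi)  # avoid division by zero
    return np.where(near_zero, 0.125, np.tanh(safe / 2.0) / (4.0 * safe))

def jj_lower_bound(x, xi):
    # Assumed (hypothetical for this record) Jaakkola-Jordan bound:
    # log sigma(x) >= log sigma(xi) + (x - xi) / 2 - lambda(xi) * (x^2 - xi^2)
    # For fixed xi the right-hand side is a quadratic function of x.
    x = np.asarray(x, dtype=float)
    return log_sigmoid(xi) + 0.5 * (x - xi) - jj_lambda(xi) * (x**2 - xi**2)

# Compare the exact log-sigmoid with its quadratic lower bound on a grid.
xs = np.linspace(-6.0, 6.0, 101)
xi = 2.0  # variational parameter; the bound is tight at x = +xi and x = -xi
gap = log_sigmoid(xs) - jj_lower_bound(xs, xi)
```

Because the bound is quadratic in `x`, substituting it for each Bernoulli log-likelihood term gives an objective that is quadratic in any parameters entering `x` linearly, which is the kind of property the abstract credits with removing the need for an iterative M-step.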
Pages: 1367-1374
Page count: 8
Related papers
50 in total
  • [41] Semiparametric Bayesian latent variable regression for skewed multivariate data
    Bhingare, Apurva
    Sinha, Debajyoti
    Pati, Debdeep
    Bandyopadhyay, Dipankar
    Lipsitz, Stuart R.
    BIOMETRICS, 2019, 75 (02) : 528 - 538
  • [42] Nonlinear latent curve models for multivariate longitudinal data
    Blozis, Shelley A.
    Conger, Katherine J.
    Harring, Jeffrey R.
    INTERNATIONAL JOURNAL OF BEHAVIORAL DEVELOPMENT, 2007, 31 (04) : 340 - 346
  • [43] Learning Binary Latent Variable Models: A Tensor Eigenpair Approach
    Jaffe, Ariel
    Weiss, Roi
    Carmi, Shai
    Kluger, Yuval
    Nadler, Boaz
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [44] LATENT VARIABLE MODELS FOR MULTIVARIATE DYADIC DATA WITH ZERO INFLATION: ANALYSIS OF INTERGENERATIONAL EXCHANGES OF FAMILY SUPPORT
    Kuha, Jouni
    Zhang, Siliang
    Steele, Fiona
    ANNALS OF APPLIED STATISTICS, 2023, 17 (02) : 1521 - 1542
  • [45] Multivariate probit linear mixed models for multivariate longitudinal binary data
    Lee, Kuo-Jung
    Kim, Chanmin
    Yoo, Jae Keun
    Lee, Keunbaik
    STATISTICS IN MEDICINE, 2024, 43 (08) : 1527 - 1548
  • [46] Exploratory data analysis using radial basis function latent variable models
    Marrs, AD
    Webb, AR
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 11, 1999, 11 : 529 - 535
  • [47] Bayesian latent variable models for spatially correlated tooth-level binary data in caries research
    Zhang, Y.
    Todem, D.
    Kim, K.
    Lesaffre, E.
    STATISTICAL MODELLING, 2011, 11 (01) : 25 - 47
  • [48] Latent Differential Equation Models for Binary and Ordinal Data
    Hu, Yueqin
    Boker, Steven M.
    STRUCTURAL EQUATION MODELING-A MULTIDISCIPLINARY JOURNAL, 2017, 24 (01) : 52 - 64
  • [49] Latent Gaussian copula models for longitudinal binary data
    Peng, Cheng
    Yang, Yihe
    Zhou, Jie
    Pan, Jianxin
    JOURNAL OF MULTIVARIATE ANALYSIS, 2022, 189
  • [50] Bayesian latent factor regression for multivariate functional data with variable selection
    Noh, Heesang
    Choi, Taeryon
    Park, Jinsu
    Chung, Yeonseung
    JOURNAL OF THE KOREAN STATISTICAL SOCIETY, 2020, 49 (03) : 901 - 923