Latent variable models for the topographic organisation of discrete and strictly positive data

Cited by: 9
Authors
Girolami, M [1]
Affiliations
[1] Univ Paisley, Dept Comp & Informat Syst, Sch Informat & Commun Technol, Appl Computat Intelligence Res Unit, Paisley PA1 2BE, Renfrew, Scotland
Keywords
generative models; topographic mappings; non-negative matrix factorisation; latent semantic analysis
DOI
10.1016/S0925-2312(01)00659-2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper is concerned with learning dense low-dimensional representations of high-dimensional positive data. The positive data may be continuous, discrete binary or count based. In addition to the low-dimensional data model, a topographic ordering of the representation is desired. The primary motivation for this work is the requirement for a low-dimensional interpretation of sparse vector space models of text documents which may take the form of binary, count based or real multivariate data. The generative topographic mapping (GTM) was developed and introduced as a principled alternative to the self-organising map for, principally, visualising high-dimensional continuous data. The GTM is one method by which a topographically organised low-dimensional data representation may be realised. There are many cases where the observation data is discrete and the application of methods developed specifically for continuous data is inappropriate. Based on the continuous GTM data model a non-linear latent variable model for modelling high-dimensional binary data is presented. The non-negative factorisation of a positive matrix which ensures a topographic ordering of the constituent factors is also presented as a principled yet non-probabilistic alternative to the GTM model. Experimental demonstrations of both methods are provided based on representing binary coded handwritten digits and the topographic organisation and visualisation of a collection of text based documents. (C) 2002 Elsevier Science B.V. All rights reserved.
Pages: 185-198
Page count: 14