Generalized Identifiability Bounds for Mixture Models With Grouped Samples

Citations: 0
Authors
Vandermeulen, Robert A. [1]
Saitenmacher, Rene [2, 3]
Affiliations
[1] Tech Univ Berlin, Berlin Inst Fdn, Learning & Data & Machine Learning Grp, D-10623 Berlin, Germany
[2] Tech Univ Berlin, Machine Learning Grp, D-10587 Berlin, Germany
[3] Weierstrass Inst Appl Anal & Stochast, D-10117 Berlin, Germany
Keywords
Nonparametric statistics; identifiability; nonparametric mixture models; tensor factorization; topic modeling; multinomial mixture model; NONPARAMETRIC-ESTIMATION; DECOMPOSITIONS; DISTRIBUTIONS; UNIQUENESS; INFERENCE; TENSORS
DOI
10.1109/TIT.2024.3367433
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recent work has shown that finite mixture models with m components are identifiable, while making no assumptions on the mixture components, so long as one has access to groups of samples of size 2m-1 which are known to come from the same mixture component. In this work we generalize that result and show that, if every subset of k mixture components of a mixture model is linearly independent, then that mixture model is identifiable with only (2m-1)/(k-1) samples per group. We further show that this value cannot be improved. We prove an analogous result for a stronger form of identifiability known as "determinedness" along with a corresponding lower bound. This independence assumption almost surely holds if mixture components are chosen randomly from a k-dimensional space. We describe some implications of our results for multinomial mixture models and topic modeling.
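As a rough illustration of the bound stated in the abstract, the sketch below computes the smallest integer group size that is at least (2m-1)/(k-1). The function name `min_group_size`, the rounding up to a whole number of samples, and the check that 2 <= k <= m are assumptions made for illustration; the theorems in the paper may carry further conditions that the abstract does not state.

```python
def min_group_size(m: int, k: int) -> int:
    """Smallest integer n with n >= (2m - 1) / (k - 1).

    m: number of mixture components.
    k: size of the component subsets assumed to be linearly independent.
    Hypothetical helper; the rounding and the argument check are assumptions.
    """
    if not (2 <= k <= m):
        raise ValueError("expected 2 <= k <= m")
    # Ceiling division in pure integer arithmetic: ceil(a / b) == -(-a // b).
    return -(-(2 * m - 1) // (k - 1))


if __name__ == "__main__":
    m = 10
    for k in (2, 4, 10):
        print(f"m={m}, k={k}: group size {min_group_size(m, k)}")
```

With k = 2 the expression reduces to 2m-1, the group size of the earlier result that the abstract says is being generalized; larger k lowers the required group size.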
Pages: 2746-2758
Page count: 13