Integrating sample similarities into latent class analysis: a tree-structured shrinkage approach

Cited by: 3
Authors
Li, Mengbing [1]
Park, Daniel E. [3]
Aziz, Maliha [3]
Liu, Cindy M. [3]
Price, Lance B. [3]
Wu, Zhenke [1, 2]
Affiliations
[1] Univ Michigan, Dept Biostat, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Michigan Inst Data Sci MIDAS, Ann Arbor, MI 48109 USA
[3] George Washington Univ, Environm & Occupat Hlth, Milken Inst, Sch Publ Hlth, Washington, DC USA
Funding
Wellcome Trust (UK); National Institutes of Health (USA)
Keywords
Gaussian diffusion; latent class models; phylogenetic tree; spike-and-slab prior; variational Bayes; zoonotic infectious diseases; BAYESIAN VARIABLE SELECTION; VARIATIONAL INFERENCE; MODELS; REGRESSION;
DOI
10.1111/biom.13580
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
This paper is concerned with using multivariate binary observations to estimate the probabilities of unobserved classes with scientific meanings. We focus on the setting where additional information about sample similarities is available and represented by a rooted weighted tree. Every leaf in the given tree contains multiple samples. Shorter distances over the tree between the leaves indicate a priori higher similarity in class probability vectors. We propose a novel data integrative extension to classical latent class models with tree-structured shrinkage. The proposed approach enables (1) borrowing of information across leaves, (2) estimating data-driven leaf groups with distinct vectors of class probabilities, and (3) individual-level probabilistic class assignment given the observed multivariate binary measurements. We derive and implement a scalable posterior inference algorithm in a variational Bayes framework. Extensive simulations show more accurate estimation of class probabilities than alternatives that suboptimally use the additional sample similarity information. A zoonotic infectious disease application is used to illustrate the proposed approach. The paper concludes with a brief discussion of model limitations and extensions.
Pages: 264-279
Page count: 16
Related papers
50 in total
  • [1] TREE-STRUCTURED SURVIVAL ANALYSIS
    GORDON, L
    OLSHEN, RA
    [J]. CANCER TREATMENT REPORTS, 1985, 69 (10): : 1065 - 1069
  • [2] Features of tree-structured survival analysis
    Segal, MR
    [J]. EPIDEMIOLOGY, 1997, 8 (04) : 344 - 346
  • [3] On subtyping of tree-structured data: A polynomial approach
    Bry, F
    Drabent, W
    Maluszynski, J
    [J]. PRINCIPLES AND PRACTICE OF SEMANTIC WEB REASONING, PROCEEDINGS, 2004, 3208 : 1 - 18
  • [4] Tree-structured analysis of survival data - Search for latent diagnostic factors in a tumour study
    Brambilla, C
    Rossi, C
    Schinaia, G
    [J]. APPLIED STOCHASTIC MODELS AND DATA ANALYSIS, 1997, 13 (3-4): : 333 - 343
  • [5] AN EXPLORATORY ANALYSIS OF SURVIVAL WITH AIDS USING A NONPARAMETRIC TREE-STRUCTURED APPROACH
    PIETTE, JD
    INTRATOR, O
    ZIERLER, S
    MOR, V
    STEIN, MD
    [J]. EPIDEMIOLOGY, 1992, 3 (04) : 310 - 318
  • [6] Analysis of Tree-Structured Architectures for Code Generation
    Dahal, Samip
    Maharana, Adyasha
    Bansal, Mohit
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 4382 - 4391
  • [7] A NON-GREEDY APPROACH TO TREE-STRUCTURED CLUSTERING
    MILLER, D
    ROSE, K
    [J]. PATTERN RECOGNITION LETTERS, 1994, 15 (07) : 683 - 690
  • [8] Sentiment Analysis with Tree-Structured Gated Recurrent Units
    Kuta, Marcin
    Morawiec, Mikolaj
    Kitowski, Jacek
    [J]. TEXT, SPEECH, AND DIALOGUE, TSD 2017, 2017, 10415 : 74 - 82
  • [9] An adaptive analysis of covariance using tree-structured regression
    Gadbury, GL
    Iyer, HK
    Schreuder, HT
    [J]. JOURNAL OF AGRICULTURAL BIOLOGICAL AND ENVIRONMENTAL STATISTICS, 2002, 7 (01) : 42 - 57
  • [10] An adaptive analysis of covariance using tree-structured regression
    G. L. Gadbury
    H. K. Iyer
    H. T. Schreuder
    [J]. Journal of Agricultural, Biological, and Environmental Statistics, 2002, 7 : 42 - 57