Bayesian analysis of two-part nonlinear latent variable model: Semiparametric method

Cited by: 3
Authors
Gou, Jian-Wei [1]
Xia, Ye-Mao [1]
Jiang, De-Peng [2]
Affiliations
[1] Nanjing Forestry Univ, Sch Sci, Dept Appl Math, Nanjing 210037, Jiangsu, Peoples R China
[2] Univ Manitoba, Dept Community Hlth Sci, Winnipeg, MB, Canada
Keywords
Markov Chains Monte Carlo; Semi-parametric Bayesian methods; semi-continuous data; truncated Dirichlet process; two-part nonlinear latent variable model; FINITE MIXTURES; DIRICHLET; COCAINE; TRAIT; DISTRIBUTIONS; TUTORIAL;
DOI
10.1177/1471082X211059233
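Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract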
The two-part model (TPM) is a widely used statistical method for analyzing semi-continuous data. Semi-continuous data can be viewed as arising from two distinct stochastic processes: one governs the occurrence, or binary part, of the data, and the other determines the intensity, or continuous part. In the regression setting, where the semi-continuous outcome is a function of covariates, the binary part is commonly modelled via logistic regression and the continuous part via a log-normal model. The conventional TPM still imposes assumptions, such as a log-normal distribution for the continuous part, no unobserved heterogeneity among the responses, and no collinearity among the covariates, that are often unrealistic in practical applications. In this article, we develop a two-part nonlinear latent variable model (TPNLVM) for mixed multiple semi-continuous and continuous variables. The semi-continuous variables are treated as indicators in the latent factor analysis along with the other manifest variables. This reduces the dimensionality of the regression model and alleviates potential multicollinearity problems. Our TPNLVM can accommodate nonlinear relationships among the latent variables extracted from the factor analysis. To downweight the influence of distributional deviations and extreme observations, we develop a Bayesian semiparametric analysis procedure. The conventional parametric assumptions on the related distributions are relaxed, and the Dirichlet process (DP) prior is used to improve model fitting. By taking advantage of the discreteness of the DP, our method is effective in capturing the heterogeneity underlying the population. Within the Bayesian paradigm, posterior inferences, including parameter estimates and model assessment, are carried out through Markov chain Monte Carlo (MCMC) sampling. To facilitate posterior sampling, we adapt the Polya-Gamma stochastic representation for the logistic model. Using simulation studies, we examine the properties and merits of our proposed methods, and we illustrate our approach by evaluating the effect of treatment on cocaine use and examining whether the treatment effect is moderated by psychiatric problems.
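As a rough illustration of two ingredients the abstract names, the sketch below simulates semi-continuous outcomes from the conventional TPM (logistic occurrence, log-normal intensity) and builds mixture weights for a truncated Dirichlet process by stick-breaking. It is not the authors' TPNLVM: there are no latent factors or nonlinear structural terms, and every parameter value, function name (`simulate_two_part`, `truncated_stick_breaking`), and the truncation level of 20 is an assumption chosen only for demonstration.

```python
import math
import random


def simulate_two_part(x, alpha, beta, sigma, rng):
    """Draw one semi-continuous outcome for a scalar covariate x.

    Binary part:     P(y > 0 | x) = logistic(alpha[0] + alpha[1] * x)
    Continuous part: log(y) | y > 0 ~ Normal(beta[0] + beta[1] * x, sigma^2)
    All coefficients are hypothetical illustration values.
    """
    p_positive = 1.0 / (1.0 + math.exp(-(alpha[0] + alpha[1] * x)))
    if rng.random() >= p_positive:
        return 0.0  # the occurrence process yields an exact zero
    return math.exp(rng.gauss(beta[0] + beta[1] * x, sigma))


def truncated_stick_breaking(concentration, n_components, rng):
    """Weights of a Dirichlet process truncated at n_components.

    Each stick fraction is Beta(1, concentration); the final component
    takes whatever stick length remains, so the weights sum to one.
    """
    weights = []
    remaining = 1.0
    for _ in range(n_components - 1):
        fraction = rng.betavariate(1.0, concentration)
        weights.append(fraction * remaining)
        remaining *= 1.0 - fraction
    weights.append(remaining)
    return weights


rng = random.Random(0)  # fixed seed so the sketch is reproducible
xs = [rng.gauss(0.0, 1.0) for _ in range(1000)]
ys = [simulate_two_part(x, (-0.5, 1.0), (1.0, 0.5), 0.8, rng) for x in xs]
zero_share = sum(y == 0.0 for y in ys) / len(ys)
weights = truncated_stick_breaking(1.0, 20, rng)

print(f"share of exact zeros: {zero_share:.2f}")
print(f"sum of DP weights:    {sum(weights):.6f}")
```

The point mass at zero alongside a right-skewed positive part is what makes the data semi-continuous. The discreteness of the stick-breaking weights is the property the abstract refers to when it says the DP prior helps capture population heterogeneity.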
Pages: 376-399
Number of pages: 24
Related papers
50 in total
  • [21] Nonparametric Bayesian functional two-part random effects model for longitudinal semicontinuous data analysis
    Park, Jinsu
    Choi, Taeryon
    Chung, Yeonseung
    BIOMETRICAL JOURNAL, 2021, 63 (04) : 787 - 805
  • [22] Bayesian joint analysis using a semiparametric latent variable model with non-ignorable missing covariates for CHNS data
    Ma, Zhihua
    Chen, Guanghui
    STATISTICAL MODELLING, 2021, 21 (04) : 313 - 331
  • [23] A Bayesian two-part quantile regression model for count data with excess zeros
    King, Clay
    Song, Joon Jin
    STATISTICAL MODELLING, 2019, 19 (06) : 653 - 673
  • [24] Bayesian estimation for a semiparametric nonlinear volatility model
    Hu, Shuowen
    Poskitt, D. S.
    Zhang, Xibin
    ECONOMIC MODELLING, 2021, 98 : 361 - 370
  • [26] Internalizing and Externalizing Problem Behavior: a Test of a Latent Variable Interaction Predicting a Two-Part Growth Model of Adolescent Substance Use
    Colder, Craig R.
    Frndak, Seth
    Lengua, Liliana J.
    Read, Jennifer P.
    Hawk, Larry W., Jr.
    Wieczorek, William F.
    JOURNAL OF ABNORMAL CHILD PSYCHOLOGY, 2018, 46 (02) : 319 - 330
  • [27] Bayesian two-part modeling of phytoplankton biomass and occurrence
    Mutshinda, Crispin M.
    Mishra, Aditya
    Finkel, Zoe V.
    Widdicombe, Claire E.
    Irwin, Andrew J.
    HYDROBIOLOGIA, 2022, 849 (05) : 1287 - 1300
  • [29] An example of a two-part latent growth curve model for semicontinuous outcomes in the health sciences
    McPherson, Sterling
    Barbosa-Leiker, Celestina
    JOURNAL OF APPLIED STATISTICS, 2012, 39 (10) : 2113 - 2128
  • [30] A Bayesian Two-Part Latent Class Model for Longitudinal Medical Expenditure Data: Assessing the Impact of Mental Health and Substance Abuse Parity
    Neelon, Brian
    O'Malley, A. James
    Normand, Sharon-Lise T.
    BIOMETRICS, 2011, 67 (01) : 280 - 289