Multivariate generalized linear mixed models for underdispersed count data

Authors
da Silva, Guilherme Parreira [1]
Laureano, Henrique Aparecido [1]
Petterle, Ricardo Rasmussen [2]
Ribeiro Jr, Paulo Justiniano [1]
Bonat, Wagner Hugo [1]
Affiliations
[1] Univ Fed Parana, Dept Stat, Lab Stat & Geoinformat, Curitiba, Brazil
[2] Univ Fed Parana, Dept Integrat Med, Curitiba, Brazil
Keywords
Regression models; automatic differentiation; multivariate models; template model builder; optimization; Laplace approximation
DOI
10.1080/00949655.2023.2184474
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Researchers are often interested in understanding the relationship between a set of covariates and a set of response variables. To achieve this goal, regression analysis, through either linear or generalized linear models, is widely applied. However, such models allow only one response variable to be modelled at a time. Moreover, a measure of correlation between the response variables cannot be obtained directly from the regression model. In this article, we employed the multivariate generalized linear mixed models framework, which allows the specification of a set of response variables and estimates the correlation between them through a random-effect structure that follows a multivariate normal distribution. We estimated all model parameters by maximum likelihood, using the Laplace approximation to integrate out the random effects. The derivatives are provided by automatic differentiation. The outer maximization was carried out with general-purpose algorithms such as PORT and the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm. We restricted the study to count response variables with the following distributions: Poisson, negative binomial, Conway-Maxwell-Poisson (COM-Poisson), and double Poisson. While the first can model only equidispersed data, the second models equidispersed and overdispersed data, and the third and fourth model all types of dispersion (i.e. including underdispersion). The models were implemented in R with the TMB package, which is based on C++ templates. Besides the full specification, models with simpler covariance structures (fixed and common variance, and rho set to 0) and with fixed dispersion were considered. These models were applied to a dataset from the National Health and Nutrition Examination Survey, measured on 1281 subjects, in which two response variables are underdispersed and one can be considered equidispersed.
The full double Poisson model outperformed the other three competitors on three goodness-of-fit measures: the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the maximized log-likelihood. Consequently, it estimated parameters with smaller standard errors and yielded a greater number of significant correlation coefficients. Therefore, the proposed model can handle multivariate count responses and measure the correlation between them while taking the effects of the covariates into account.
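The dispersion behaviour described in the abstract can be illustrated with a minimal sketch for one of the four distributions, the COM-Poisson. This is not the authors' R/TMB implementation: the function names, the parameter values, and the truncation point `max_y` are assumptions made for illustration only. The sketch computes the variance-to-mean ratio, which is below 1 for underdispersed counts, equal to 1 for equidispersed counts, and above 1 for overdispersed counts.

```python
import math


def com_poisson_pmf(lam, nu, max_y=200):
    """Return [P(Y=0), ..., P(Y=max_y)] for a COM-Poisson(lam, nu) variable.

    The pmf is proportional to lam**y / (y!)**nu. The infinite normalizing
    sum is truncated at max_y (an assumption; adequate only when the
    probability mass beyond max_y is negligible).
    """
    # Log-scale weights avoid overflow: log w_y = y*log(lam) - nu*log(y!)
    log_w = [y * math.log(lam) - nu * math.lgamma(y + 1)
             for y in range(max_y + 1)]
    shift = max(log_w)
    w = [math.exp(v - shift) for v in log_w]
    total = sum(w)
    return [v / total for v in w]


def dispersion_index(pmf):
    """Variance-to-mean ratio of a count distribution given as a pmf list."""
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return var / mean


# nu < 1: overdispersed; nu = 1: Poisson (equidispersed); nu > 1: underdispersed
for nu in (0.5, 1.0, 2.0):
    print(nu, dispersion_index(com_poisson_pmf(3.0, nu)))
```

With `nu = 1` the COM-Poisson reduces to the Poisson, so the ratio is 1 up to floating-point error, while `nu = 2` gives a ratio below 1, the underdispersed case that motivates the paper.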
Pages: 2410-2427
Number of pages: 18