Assessing Bayesian Semi-Parametric Log-Linear Models: An Application to Disclosure Risk Estimation

Cited by: 0
Authors
Carota, Cinzia [1]
Filippone, Maurizio [2]
Polettini, Silvia [3]
Affiliations
[1] Univ Torino, Dipartimento Econ & Stat Cognetti de Martiis, Lungo Dora Siena 100 A, I-10153 Turin, Italy
[2] EURECOM Campus SophiaTech, Dept Data Sci, 450 Route Chappes, F-06410 Biot, France
[3] Sapienza Univ Roma, Dipartimento Sci Sociali Econ, Ple Aldo Moro 5, I-00185 Rome, Italy
Keywords
Bayesian model selection; Dirichlet process random effects; Disclosure risk; Log-linear mixed models; Model's predictive performance; Selection-induced bias; Statistical disclosure limitation; SELECTION; EXAMPLES
DOI
10.1111/insr.12471
Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
We propose a method for identifying models with good predictive performance in the family of Bayesian log-linear mixed models with Dirichlet process random effects for count data. Their wide applicability makes the assessment of model performance crucial in many fields, including disclosure risk estimation, which is the focus of the present work. Rather than assessing models on the whole contingency table, we target the specific objective of the analysis and propose a two-stage model selection procedure aimed at limiting a form of bias arising in the process of model selection. Our proposal combines two different criteria: at the first stage, a path in the model search space is identified through a strongly penalized log-likelihood; at the second, a small number of semi-parametric models is evaluated through a context-dependent score-based information criterion. Tested on a variety of contingency tables, our method proves to be able to identify models with good predictive performance in a few steps, even in the presence of large tables with many sampling and structural zeros. We carefully discuss the proposed method in the context of the literature on model assessment and contextualize the illustrative application in the recent debate on statistical disclosure limitation. Finally, we provide examples of further applications in different research areas.
Pages: 165-183
Number of pages: 19