Bayesian Multiscale Multiple Imputation With Implications for Data Confidentiality

Cited by: 23
Authors
Holan, Scott H. [1 ]
Toth, Daniell [2 ]
Ferreira, Marco A. R. [1 ]
Karr, Alan F. [3 ]
Affiliations
[1] Univ Missouri, Dept Stat, Columbia, MO 65211 USA
[2] US Bur Labor Stat, Off Survey Methods Res, Washington, DC 20212 USA
[3] Natl Inst Stat Sci, Res Triangle Pk, NC 27709 USA
Funding
National Science Foundation (USA);
Keywords
Cell suppression; Disclosure; Dynamic linear models; Missing data; Multiscale modeling; QCEW; DISCLOSURE LIMITATION; CELL SUPPRESSION; TABULAR DATA; MICRODATA; MODELS; FRAMEWORK; PRIVACY; SYSTEMS; UTILITY; WORLD;
DOI
10.1198/jasa.2009.ap08629
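Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code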
020208 ; 070103 ; 0714 ;
Abstract
Many scientific, sociological, and economic applications present data that are collected on multiple scales of resolution. One particular form of multiscale data arises when data are aggregated across different scales both longitudinally and by economic sector. Frequently, such datasets experience missing observations in a manner that they can be accurately imputed, while respecting the constraints imposed by the multiscale nature of the data, using the method we propose known as Bayesian multiscale multiple imputation. Our approach couples dynamic linear models with a novel imputation step based on singular normal distribution theory. Although our method is of independent interest, one important implication of such methodology is its potential effect on confidential databases protected by means of cell suppression. In order to demonstrate the proposed methodology and to assess the effectiveness of disclosure practices in longitudinal databases, we conduct a large-scale empirical study using the U.S. Bureau of Labor Statistics Quarterly Census of Employment and Wages (QCEW). During the course of our empirical investigation it is determined that several of the predicted cells are within 1% accuracy, thus causing potential concerns for data confidentiality.
Pages: 564-577
Page count: 14
Related papers
50 in total
  • [1] Multiple imputation for longitudinal data using Bayesian lasso imputation model
    Yamaguchi, Yusuke
    Yoshida, Satoshi
    Misumi, Toshihiro
    Maruo, Kazushi
    STATISTICS IN MEDICINE, 2022, 41 (06) : 1042 - 1058
  • [2] Bayesian multiple imputation for assay data subject to measurement error
    Guo Y.
    Little R.J.
    Journal of Statistical Theory and Practice, 2013, 7 (2) : 219 - 232
  • [3] Multiple Imputation for Longitudinal Data Under a Bayesian Multilevel Model
    Demirtas, Hakan
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2009, 38 (16-17) : 2812 - 2828
  • [4] Bayesian Latent Class Models for the Multiple Imputation of Categorical Data
    Vidotto, Davide
    Vermunt, Jeroen K.
    Van Deun, Katrijn
    METHODOLOGY-EUROPEAN JOURNAL OF RESEARCH METHODS FOR THE BEHAVIORAL AND SOCIAL SCIENCES, 2018, 14 (02) : 56 - 68
  • [5] BAYESIAN IMPUTATION FOR MISSING DATA
    Nads, Azman A.
    Polestico, Daisy Lou L.
    ADVANCES AND APPLICATIONS IN STATISTICS, 2022, 79 : 83 - 104
  • [6] Bayesian nonparametric multiple imputation of partially observed data with ignorable nonresponse
    Paddock, SM
    BIOMETRIKA, 2002, 89 (03) : 529 - 538
  • [7] A Bayesian multiple imputation approach to bivariate functional data with missing components
    Jang, Jeong Hoon
    Manatunga, Amita K.
    Chang, Changgee
    Long, Qi
    STATISTICS IN MEDICINE, 2021, 40 (22) : 4772 - 4793
  • [8] Bayesian multiple imputation for large-scale categorical data with structural zeros
    Manrique-Vallier, Daniel
    Reiter, Jerome P.
    SURVEY METHODOLOGY, 2014, 40 (01) : 125 - 134
  • [9] Bayesian Multilevel Latent Class Models for the Multiple Imputation of Nested Categorical Data
    Vidotto, Davide
    Vermunt, Jeroen K.
    van Deun, Katrijn
    JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS, 2018, 43 (05) : 511 - 539
  • [10] Noise correction using Bayesian multiple imputation
    Van Hulse, Jason
    Khoshgoftaar, Taghi M.
    Seiffert, Chris
    Zhao, Lili
    IRI 2006: PROCEEDINGS OF THE 2006 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION, 2006, : 478 - +