FINDING A FLEXIBLE HOT-DECK IMPUTATION METHOD FOR MULTINOMIAL DATA

Cited by: 5
Authors
Andridge, Rebecca [1 ]
Bechtel, Laura [2 ]
Thompson, Katherine Jenny [2 ]
Affiliations
[1] Ohio State Univ, 1841 Neil Ave, Columbus, OH 43210 USA
[2] US Census Bur, 4600 Silver Hill Rd, Washington, DC 20233 USA
Keywords
MULTIPLE IMPUTATION;
DOI
10.1093/jssam/smaa005
Chinese Library Classification number
O1 [Mathematics]; C [Social Sciences, General];
Discipline classification code
03 ; 0303 ; 0701 ; 070101 ;
Abstract
Detailed breakdowns on totals are often collected in surveys, such as a breakdown of total product sales by product type. These multinomial data are often sparsely reported with wide variability in proportions across units. In addition, there are often true zeros that differ across units even within industry; for example, one establishment sells jeans but not shoes, and another sells shoes but not socks. It is quite common to have large fractions of missing data for these detailed items, even when totals are relatively completely observed. Hot-deck imputation, which fills in missing data with observed data values, is an attractive approach. The entire set of proportions can be simultaneously imputed to preserve multinomial distributions, and zero values can be imputed. However, it is not clear what variant of the hot deck is best. We describe a large set of "flavors" of the hot deck and compare them through simulation and by application to data from the 2012 Economic Census. We consider different ways to create the donor pool: choosing one nearest neighbor (NN), choosing from five NNs, or using all units as the donor pool. We also consider different ways to impute from the donor: directly impute the donor's vector of proportions or randomly draw from a multinomial distribution using this vector of proportions. We consider scenarios where a strong predictor of these multinomial distributions exists as well as when covariate information is weak.
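The hot-deck "flavors" described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `hot_deck_impute`, the single numeric covariate used for nearest-neighbor matching, the absolute-difference distance, and the use of the reported total as the multinomial count are all assumptions made for illustration.

```python
import numpy as np

def hot_deck_impute(x_recipient, total, donor_x, donor_props,
                    k=1, draw=False, rng=None):
    """Impute detail amounts for one recipient from a donor (illustrative only).

    k=1 uses the single nearest neighbor, k=5 picks at random among the
    five nearest, and k=None uses every donor as the pool.
    """
    rng = np.random.default_rng() if rng is None else rng
    donor_x = np.asarray(donor_x, dtype=float)
    donor_props = np.asarray(donor_props, dtype=float)
    n = len(donor_x)

    # Build the donor pool (assumed distance: absolute covariate difference).
    if k is None or k >= n:
        pool = np.arange(n)
    else:
        pool = np.argsort(np.abs(donor_x - x_recipient), kind="stable")[:k]

    # Select one donor at random from the pool.
    p = donor_props[rng.choice(pool)]

    if draw:
        # Random draw: multinomial counts using the donor's proportions.
        # Assumes the total is a non-negative integer.
        return rng.multinomial(int(total), p)
    # Direct imputation: apply the donor's proportions to the recipient's total.
    return total * p

# Hypothetical donors: a covariate value and a vector of proportions each.
donor_x = [1.0, 5.0, 10.0]
donor_props = [[0.5, 0.5, 0.0],
               [0.2, 0.0, 0.8],
               [0.0, 1.0, 0.0]]

direct = hot_deck_impute(4.8, 100, donor_x, donor_props, k=1)
drawn = hot_deck_impute(4.8, 100, donor_x, donor_props, k=1, draw=True,
                        rng=np.random.default_rng(0))
```

With direct imputation, a zero in the donor's vector always carries over to the recipient as a zero. In the multinomial draw, a category with proportion zero also receives a count of zero, but a category with a small positive proportion may be drawn as zero by chance.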
Pages: 789 - 809
Page count: 21
Related papers
50 in total
  • [21] Hot Deck Multiple Imputation for Handling Missing Accelerometer Data
    Nicole M. Butera
    Siying Li
    Kelly R. Evenson
    Chongzhi Di
    David M. Buchner
    Michael J. LaMonte
    Andrea Z. LaCroix
    Amy Herring
    [J]. Statistics in Biosciences, 2019, 11 : 422 - 448
  • [22] Regression fractional hot deck imputation
    Kim, Jae Kwang
    [J]. JOURNAL OF THE KOREAN STATISTICAL SOCIETY, 2007, 36 (03) : 423 - 434
  • [23] Using the Fractional Imputation Methodology to Evaluate Variance due to Hot Deck Imputation in Survey Data
    Perez, Adriana
    [J]. JOURNAL OF MODERN APPLIED STATISTICAL METHODS, 2007, 6 (01) : 248 - 257
  • [24] A Comparison of Hot Deck Imputation and Substitution Methods in The Estimation of Missing Data
    Yesilova, Abdullah
    Kaya, Yilmaz
    Almali, M. Nuri
    [J]. GAZI UNIVERSITY JOURNAL OF SCIENCE, 2011, 24 (01) : 69 - 75
  • [25] Can earnings equations estimates improve CPS hot-deck imputations?
    John A. Bishop
    John P. Formby
    Paul D. Thistle
    [J]. Journal of Labor Research, 2003, 24 : 153 - 159
  • [26] Hot-deck imputation of missing data in matched designs
    任金马
    赵杨
    陈峰
    蓝绍颖
    [J]. 中国卫生统计, 2004, (05) : 48 - 51
  • [27] Predictive Mean Matching as an alternative imputation method to hot deck in Vigitel
    Santana dos Santos, Iolanda Karla
    Conde, Wolney Lisboa
    [J]. CADERNOS DE SAUDE PUBLICA, 2020, 36 (06):
  • [28] A hot deck imputation procedure for multiply imputing nonignorable missing data: The proxy pattern-mixture hot deck
    Sullivan, Danielle
    Andridge, Rebecca
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2015, 82 : 173 - 185
  • [29] Can earnings equations estimates improve CPS hot-deck imputations?
    Bishop, JA
    Formby, JP
    Thistle, PD
    [J]. JOURNAL OF LABOR RESEARCH, 2003, 24 (01) : 153 - 159
  • [30] The Use of Sample Weights in Hot Deck Imputation
    Andridge, Rebecca R.
    Little, Roderick J.
    [J]. JOURNAL OF OFFICIAL STATISTICS, 2009, 25 (01) : 21 - 36