Expanding Tidy Data Principles to Facilitate Missing Data Exploration, Visualization and Assessment of Imputations

Cited by: 47
Authors
Tierney, Nicholas [1, 2]
Cook, Dianne [1]
Affiliations
[1] Monash Univ, Melbourne, Vic, Australia
[2] Telethon Kids Inst, Perth, WA, Australia
Source
JOURNAL OF STATISTICAL SOFTWARE | 2023 / Volume 105 / Issue 07
Keywords
statistical computing; statistical graphics; data science; data visualization; tidyverse; data pipeline; R; R-PACKAGE
DOI
10.18637/jss.v105.i07
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Despite the large body of research on missing value distributions and imputation, there is comparatively little literature with a focus on how to make it easy to handle, explore, and impute missing values in data. This paper addresses this gap. The new methodology builds upon tidy data principles, with the goal of integrating missing value handling as a key part of data analysis workflows. We define a new data structure, and a suite of new operations. Together, these provide a connected framework for handling, exploring, and imputing missing values. These methods are available in the R package naniar.
Pages: 1-31
Page count: 31
Related papers
50 items in total
  • [31] Alternative expectation approaches for expectation-maximization missing data imputations in cox regression
    Saglam, Fatih
    Sanli, Tuba
    Cengiz, Mehmet Ali
    Terzi, Yuksel
    [J]. COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2023, 52 (12) : 5966 - 5974
  • [32] Updating Canada's National Forest Inventory with multiple imputations of missing contemporary data
    Magnussen, Steen
    Stinson, Graham
    Boudewyn, Paul
    [J]. FORESTRY CHRONICLE, 2017, 93 (03) : 212 - 224
  • [33] Missing data exploration: highlighting graphical presentation of missing pattern
    Zhang, Zhongheng
    [J]. ANNALS OF TRANSLATIONAL MEDICINE, 2015, 3 (22)
  • [34] Big data visualization for in-situ data exploration for sportsperson
    Li, Wenya
    Karthik, C.
    Rajalakshmi, M.
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2022, 99
  • [35] SUSTAINABILITY, IMPUTATIONS AND MISSING DATA: 12-MONTHS AFTER A PHYSICAL ACTIVITY COUNSELING TRIAL
    [Author unknown]
    [J]. GERONTOLOGIST, 2009, 49 : 212 - 212
  • [36] CubeViz - Exploration and Visualization of Statistical Linked Data
    Martin, Michael
    Abicht, Konrad
    Stadler, Claus
    Auer, Soeren
    Ngomo, Axel-C. Ngonga
    Soru, Tommaso
    [J]. WWW'15 COMPANION: PROCEEDINGS OF THE 24TH INTERNATIONAL CONFERENCE ON WORLD WIDE WEB, 2015 : 219 - 222
  • [37] Exploration of Climate Data Using Interactive Visualization
    Ladstaedter, Florian
    Steiner, Andrea K.
    Lackner, Bettina C.
    Pirscher, Barbara
    Kirchengast, Gottfried
    Kehrer, Johannes
    Hauser, Helwig
    Muigg, Philipp
    Doleisch, Helmut
    [J]. JOURNAL OF ATMOSPHERIC AND OCEANIC TECHNOLOGY, 2010, 27 (04) : 667 - 679
  • [38] Spatial analysis and visualization of exploration geochemical data
    Zuo, Renguang
    Carranza, Emmanuel John M.
    Wang, Jian
    [J]. EARTH-SCIENCE REVIEWS, 2016, 158 : 9 - 18
  • [39] The Exploration Machine - A Novel Method for Data Visualization
    Wismueller, Axel
    [J]. ADVANCES IN SELF-ORGANIZING MAPS, PROCEEDINGS, 2009, 5629 : 344 - 352
  • [40] VISAGE - A VISUALIZATION AND EXPLORATION FRAMEWORK FOR ENVIRONMENTAL DATA
    Conover, Helen
    Berendes, Todd
    Gatlin, Patrick
    Maskey, Manil
    Naeger, Aaron
    Wingo, Stephanie
    Kulkarni, Ajinkya
    Marouane, Abdelhak
    Wang, Lihua
    Ellingson, Brian
    Dahal, Bibek
    Singhirunnusorn, Khomsun
    [J]. 2019 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2019), 2019 : 5405 - 5408