Incomplete clustering analysis via multiple imputation

Cited by: 1
Authors
Lee, Jung Wun [1]
Harel, Ofer [1]
Affiliation
[1] Univ Connecticut, Dept Stat, 215 Glenbrook Rd Unit 4120, Storrs, CT 06269 USA
Funding
US National Science Foundation
Keywords
Incomplete data; model-based clustering; cluster analysis; multiple imputation; missing data; NUMBER;
DOI
10.1080/02664763.2022.2060952
Chinese Library Classification
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Clustering analysis is a prevalent statistical method that divides populations into several subgroups of similar units. However, most existing clustering methods require complete data. One general method that addresses incomplete data is multiple imputation (MI), which avoids many limitations of single imputation-based methods and complete-case analyses. Nevertheless, adopting the MI framework for clustering analysis can be challenging, since each imputed data set might consist of a different number of clusters and there is no unique parameter for clustering analysis. In response to this problem, we have developed MICA: Multiply Imputed Cluster Analysis. MICA is a framework for clustering incomplete data consisting of two clustering stages. We assess the properties of MICA and its superiority over other existing incomplete clustering strategies based on a simulation study under various data structures. In addition, we demonstrate the usage of MICA by applying it to the Youth Risk Behavior Surveillance System (YRBSS) 2019 data.
Pages: 1962-1979
Page count: 18