An examination of indexes for determining the number of clusters in binary data sets

Authors
Evgenia Dimitriadou
Sara Dolničar
Andreas Weingessel
Affiliations
[1] Technische Universität Wien, Institut für Statistik und Wahrscheinlichkeitstheorie
[2] Wirtschaftsuniversität Wien, Institut für Tourismus und Freizeitwirtschaft
Source
Psychometrika | 2002 / Vol. 67
Keywords
number of clusters; clustering indexes; binary data; artificial data sets; market segmentation
DOI
Not available
Abstract
The problem of choosing the correct number of clusters is as old as cluster analysis itself. A number of authors have suggested various indexes to facilitate this crucial decision. One of the most extensive comparative studies of indexes was conducted by Milligan and Cooper (1985). The present piece of work pursues the same goal under different conditions. In contrast to Milligan and Cooper's work, the emphasis here is on high-dimensional empirical binary data. Binary artificial data sets are constructed to reflect features typically encountered in real-world data situations in the field of marketing research. The simulation includes 162 binary data sets that are clustered by two different algorithms and lead to recommendations on the number of clusters for each index under consideration. Index results are evaluated and their performance is compared and analyzed.
Pages: 137 - 159
Page count: 22
Related papers
  • [1] An examination of indexes for determining the number of clusters in binary data sets
    Dimitriadou, E
    Dolnicar, S
    Weingessel, A
    PSYCHOMETRIKA, 2002, 67 (01) : 137 - 159
  • [2] Automatically Determining the Number of Clusters in Unlabeled Data Sets
    Wang, Liang
    Leckie, Christopher
    Ramamohanarao, Kotagiri
    Bezdek, James
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2009, 21 (03) : 335 - 350
  • [3] AN EXAMINATION OF PROCEDURES FOR DETERMINING THE NUMBER OF CLUSTERS IN A DATA SET
    MILLIGAN, GW
    COOPER, MC
    PSYCHOMETRIKA, 1985, 50 (02) : 159 - 179
  • [4] EVALUATION OF COEFFICIENTS FOR DETERMINING THE OPTIMAL NUMBER OF CLUSTERS IN CLUSTER ANALYSIS ON REAL DATA SETS
    Loster, Tomas
    9TH INTERNATIONAL DAYS OF STATISTICS AND ECONOMICS, 2015, : 1014 - 1023
  • [5] Dynamic estimation of number of clusters in data sets
    Boudraa, AO
    ELECTRONICS LETTERS, 1999, 35 (19) : 1606 - 1608
  • [6] The influence of the number of clusters on randomly expanded data sets
    van Zyl, J
    Cloete, I
    2003 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-5, PROCEEDINGS, 2003, : 355 - 359
  • [7] Effects of Resampling in Determining the Number of Clusters in a Data Set
    Rainer Dangl
    Friedrich Leisch
    Journal of Classification, 2020, 37 : 558 - 583
  • [8] Effects of Resampling in Determining the Number of Clusters in a Data Set
    Dangl, Rainer
    Leisch, Friedrich
    JOURNAL OF CLASSIFICATION, 2020, 37 (03) : 558 - 583
  • [9] Estimation of the Number of Clusters in Multipath Radio Channel Data Sets
    Mota, Susana
    Perez-Fontan, Fernando
    Rocha, Armando
    IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION, 2013, 61 (05) : 2879 - 2883
  • [10] A new validation index for determining the number of clusters in a data set
    Sun, HJ
    Wang, SG
    Jiang, QS
    IJCNN'01: INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-4, PROCEEDINGS, 2001, : 1852 - 1857