A hypothesis test for comparing two partitions obtained from the same dataset

Authors
Bourel, Mathias [1]
Ghattas, Badih [2]
Gonzalez, Meliza [1]
Affiliations
[1] Univ Republica, Inst Matemat & Estadist, Montevideo, Uruguay
[2] Aix Marseille Sch Econ, Marseille, France
Keywords
Clustering; Comparing partitions; Hypothesis test; Matching error; CLUSTERINGS; CRITERIA;
DOI
10.1080/03610918.2025.2458574
Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
We propose a nonparametric hypothesis test to compare two partitions of the same data set. The partitions may result from two different clustering approaches. The test can be carried out with any comparison index, but we focus in particular on the Matching Error (ME), which is related to the misclassification error in supervised learning. Some properties of the ME, and especially its distribution function in the case of two different partitions, are analyzed. Extensive simulations and experiments show the efficiency of the test.
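This record does not reproduce the paper's formal definition of the ME. As an illustration only, the sketch below assumes the usual classification-error form: the smallest fraction of points on which two partitions disagree, minimized over all one-to-one matchings between their cluster labels. The function name `matching_error` and the brute-force search over permutations are assumptions made for this example, not the authors' implementation, and the search is only practical for a small number of clusters.

```python
from itertools import permutations


def matching_error(labels_a, labels_b):
    """Fraction of points on which two partitions disagree, minimized
    over all one-to-one matchings of their cluster labels.

    Assumed definition (not taken from the paper); brute force, so it
    is only suitable for a handful of clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same points")
    n = len(labels_a)
    if n == 0:
        raise ValueError("partitions must be non-empty")

    # Contingency counts: how many points fall in cluster a AND cluster b.
    counts = {}
    for a, b in zip(labels_a, labels_b):
        counts[(a, b)] = counts.get((a, b), 0) + 1

    clusters_a = list(dict.fromkeys(labels_a))
    clusters_b = list(dict.fromkeys(labels_b))

    # Match every cluster of the smaller partition to a distinct cluster
    # of the larger one, and keep the matching with the most agreements.
    if len(clusters_a) <= len(clusters_b):
        small, large = clusters_a, clusters_b
        key = lambda s, l: (s, l)
    else:
        small, large = clusters_b, clusters_a
        key = lambda s, l: (l, s)

    best_agreement = max(
        sum(counts.get(key(s, l), 0) for s, l in zip(small, perm))
        for perm in permutations(large, len(small))
    )
    return 1.0 - best_agreement / n
```

Under this assumed definition, two partitions that differ only by a renaming of the clusters have an ME of zero, which is what makes the index suitable for comparing the outputs of different clustering methods.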
Pages: 23