Model-based clustering for multivariate partial ranking data

Cited by: 22
Authors
Jacques, Julien [1 ,2 ,3 ]
Biernacki, Christophe [1 ,2 ,3 ]
Affiliations
[1] Univ Lille 1, F-59655 Villeneuve d'Ascq, France
[2] CNRS, F-75700 Paris, France
[3] Inria, Paris, France
Keywords
Multivariate ranking; Partial ranking; Mixture model; Insertion sort rank; SEM algorithm; Gibbs sampling; MIXTURE;
DOI
10.1016/j.jspi.2014.02.011
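Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code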
020208 ; 070103 ; 0714 ;
Abstract
This paper proposes the first model-based clustering algorithm dedicated to multivariate partial ranking data. It extends the Insertion Sorting Rank (ISR) model for ranking data, which has two useful properties: it is a meaningful model, since it is described by a location parameter and a scale parameter, and it is a kind of "physical" model, since it is derived from a ranking generating process assumed to be an insertion sorting algorithm. The heterogeneity of the rank population is modeled by a mixture of ISR models, whereas a conditional independence assumption allows the extension to multivariate rankings. Maximum likelihood estimation is performed through a SEM-Gibbs algorithm, and partial rankings are treated as missing data, which allows them to be simulated during the estimation process. After the estimation algorithm and the robustness of the model were validated on simulated datasets, three real datasets were studied: the 1980 American Psychological Association (APA) presidential election votes, the results of French students on a general knowledge test, and the votes of European countries in the Eurovision song contest. The proposed model appears to be relevant in comparison with the most standard competitor ranking models (when available) and leads to meaningful interpretations for each application. In particular, the analysis of the Eurovision contest exhibits regional alliances between European countries, which are often suspected but never proved. (C) 2014 Elsevier B.V. All rights reserved.
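The generating process mentioned in the abstract (a ranking produced by an insertion sort whose comparisons may be wrong) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `simulate_isr_ordering`, the uniformly random presentation order, and the left-to-right comparison scheme are assumptions made here for illustration; `reference_order` plays the role of the location parameter and `pi` the role of the scale parameter.

```python
import random


def simulate_isr_ordering(reference_order, pi, rng=None):
    """Draw one ordering via insertion sort with noisy paired comparisons.

    reference_order: objects listed from most to least preferred
        (assumed stand-in for the location parameter).
    pi: probability that a single paired comparison agrees with
        reference_order (assumed stand-in for the scale parameter).
    """
    rng = rng or random.Random()
    position = {obj: i for i, obj in enumerate(reference_order)}

    # Assumption: objects are presented to the judge in a uniformly random order.
    presentation = list(reference_order)
    rng.shuffle(presentation)

    ordered = []
    for obj in presentation:
        insert_at = len(ordered)  # default: place after everything sorted so far
        # Compare the new object with the already sorted ones, left to right.
        for j, other in enumerate(ordered):
            truly_before = position[obj] < position[other]
            comparison_is_correct = rng.random() < pi
            judged_before = truly_before if comparison_is_correct else not truly_before
            if judged_before:
                insert_at = j
                break
        ordered.insert(insert_at, obj)
    return ordered


if __name__ == "__main__":
    rng = random.Random(0)
    # With pi = 1.0 every comparison is correct, so the reference order is recovered.
    print(simulate_isr_ordering(list("ABCD"), 1.0, rng))
    # With a smaller pi the draws scatter around the reference order.
    print(simulate_isr_ordering(list("ABCD"), 0.7, rng))
```

In this sketch, values of `pi` close to 1 concentrate the simulated orderings on the reference order, while `pi = 0.5` makes every comparison a coin flip. Clustering with a mixture, as described in the abstract, would correspond to drawing each observation from one of several such components, each with its own reference order and `pi`.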
Pages: 201-217
Number of pages: 17
Related papers
50 in total
  • [1] Model-based clustering for multivariate functional data
    Jacques, Julien
    Preda, Cristian
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2014, 71 : 92 - 106
  • [2] Probabilistic model-based clustering of multivariate and sequential data
    Smyth, P
    [J]. ARTIFICIAL INTELLIGENCE AND STATISTICS 99, PROCEEDINGS, 1999, : 299 - 304
  • [3] Model-based simultaneous clustering and ordination of multivariate abundance data in ecology
    Hui, Francis K. C.
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2017, 105 : 1 - 10
  • [4] Model-based clustering of multivariate skew data with circular components and missing values
    Lagona, Francesco
    Picone, Marco
    [J]. JOURNAL OF APPLIED STATISTICS, 2012, 39 (05) : 927 - 945
  • [5] Fast model-based clustering of partial records
    Goren, Emily M.
    Maitra, Ranjan
    [J]. STAT, 2022, 11 (01):
  • [6] A Model-Based Multivariate Time Series Clustering Algorithm
    Zhou, Pei-Yuan
    Chan, Keith C. C.
    [J]. TRENDS AND APPLICATIONS IN KNOWLEDGE DISCOVERY AND DATA MINING, 2014, 8643 : 805 - 817
  • [7] Model-based clustering of longitudinal data
    McNicholas, Paul D.
    Murphy, T. Brendan
    [J]. CANADIAN JOURNAL OF STATISTICS-REVUE CANADIENNE DE STATISTIQUE, 2010, 38 (01) : 153 - 168
  • [8] Boosting for model-based data clustering
    Saffari, Amir
    Bischof, Horst
    [J]. PATTERN RECOGNITION, 2008, 5096 : 51 - 60
  • [9] Model-based clustering for longitudinal data
    De la Cruz-Mesia, Rolando
    Quintana, Fernando A.
    Marshall, Guillermo
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2008, 52 (03) : 1441 - 1457
  • [10] Model-Based Clustering of Temporal Data
    El Assaad, Hani
    Same, Allou
    Govaert, Gerard
    Aknin, Patrice
    [J]. ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING - ICANN 2013, 2013, 8131 : 9 - 16