Clustering Algorithms for Chains

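Author
Ukkonen, Antti [1]
Affiliation
[1] Yahoo Res, Barcelona 08018, Spain
Funding
Academy of Finland
Keywords
Lloyd's algorithm; orders; preference statements; planted partition model; randomization testing
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract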
We consider the problem of clustering a set of chains to k clusters. A chain is a totally ordered subset of a finite set of items. Chains are an intuitive way to express preferences over a set of alternatives, as well as a useful representation of ratings in situations where the item-specific scores are either difficult to obtain, too noisy due to measurement error, or simply not as relevant as the order that they induce over the items. First we adapt the classical k-means for chains by proposing a suitable distance function and a centroid structure. We also present two different approaches for mapping chains to a vector space. The first one is related to the planted partition model, while the second one has an intuitive geometrical interpretation. Finally we discuss a randomization test for assessing the significance of a clustering. To this end we present an MCMC algorithm for sampling random sets of chains that share certain properties with the original data. The methods are studied in a series of experiments using real and artificial data. Results indicate that the methods produce interesting clusterings, and for certain types of inputs improve upon previous work on clustering algorithms for orders.
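The abstract does not spell out the distance function, the centroid structure, or the two vector-space mappings, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a pairwise-order encoding (one coordinate per item pair, +1, -1, or 0), which is one plausible way to embed chains and not necessarily either mapping from the paper, and then runs plain Lloyd's algorithm with squared Euclidean distance. The names `chain_to_vector`, `sq_dist`, `mean_vector`, and `lloyd` are hypothetical, as is the toy data.

```python
import random
from itertools import combinations


def chain_to_vector(chain, items):
    """Encode a chain over all unordered item pairs (assumed encoding).

    For each pair (u, v): +1 if u precedes v in the chain, -1 if v
    precedes u, and 0 if the chain does not contain both items.
    """
    pos = {item: i for i, item in enumerate(chain)}
    vec = []
    for u, v in combinations(items, 2):
        if u in pos and v in pos:
            vec.append(1.0 if pos[u] < pos[v] else -1.0)
        else:
            vec.append(0.0)
    return vec


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Coordinate-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lloyd(vectors, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Initialize centroids with k distinct input vectors.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest centroid for every vector.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: recompute each centroid; keep the old one
        # if a cluster ends up empty.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = mean_vector(members)
    return labels, centroids


# Toy data (made up): three roughly "forward" and three "reversed" chains.
items = ["a", "b", "c", "d"]
chains = [
    ("a", "b", "c"), ("a", "b", "d"), ("a", "c", "d"),
    ("d", "c", "b"), ("d", "b", "a"), ("c", "b", "a"),
]
vectors = [chain_to_vector(ch, items) for ch in chains]
labels, centroids = lloyd(vectors, k=2)
```

As with any run of Lloyd's algorithm, the resulting labels depend on the random initialization, so several restarts with different `seed` values are usually advisable.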
Pages: 1389-1423
Page count: 35