Clustering Algorithms for Chains

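Author
Ukkonen, Antti [1]
Affiliation
[1] Yahoo Res, Barcelona 08018, Spain
Funding
Academy of Finland;
Keywords
Lloyd's algorithm; orders; preference statements; planted partition model; randomization testing;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract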
We consider the problem of clustering a set of chains into k clusters. A chain is a totally ordered subset of a finite set of items. Chains are an intuitive way to express preferences over a set of alternatives, as well as a useful representation of ratings in situations where the item-specific scores are either difficult to obtain, too noisy due to measurement error, or simply not as relevant as the order that they induce over the items. First, we adapt classical k-means to chains by proposing a suitable distance function and a centroid structure. We also present two different approaches for mapping chains to a vector space. The first one is related to the planted partition model, while the second one has an intuitive geometrical interpretation. Finally, we discuss a randomization test for assessing the significance of a clustering. To this end, we present an MCMC algorithm for sampling random sets of chains that share certain properties with the original data. The methods are studied in a series of experiments using real and artificial data. Results indicate that the methods produce interesting clusterings and, for certain types of inputs, improve upon previous work on clustering algorithms for orders.
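To make the idea of mapping chains to a vector space concrete, here is a minimal Python sketch. The pairwise encoding in `chain_to_vector` (+1, -1, or 0 per item pair) is an illustrative assumption, not a representation confirmed by the abstract, and the clustering step is plain Euclidean Lloyd's algorithm rather than the distance function and centroid structure the paper proposes. All function names and the toy data are hypothetical.

```python
from itertools import combinations

def chain_to_vector(chain, items):
    """Assumed encoding: one entry per item pair (u, v) taken in `items` order.
    +1 if u precedes v in the chain, -1 if v precedes u, 0 if either is absent."""
    pos = {item: i for i, item in enumerate(chain)}
    vec = []
    for u, v in combinations(items, 2):
        if u in pos and v in pos:
            vec.append(1.0 if pos[u] < pos[v] else -1.0)
        else:
            vec.append(0.0)
    return vec

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Coordinate-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def lloyd(vectors, init_centroids, max_iter=100):
    """Plain Lloyd's algorithm: alternate assignment and centroid update
    until the assignment stops changing (or max_iter is reached)."""
    centroids = [list(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda j: sq_dist(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = mean(members)
    return labels, centroids

# Toy data (hypothetical): three chains roughly follow a < b < c < d,
# three roughly follow the reverse order.
items = ["a", "b", "c", "d"]
chains = [
    ["a", "b", "c"], ["a", "c", "d"], ["b", "c", "d"],
    ["d", "c", "b"], ["c", "b", "a"], ["d", "b", "a"],
]
vectors = [chain_to_vector(c, items) for c in chains]
labels, centroids = lloyd(vectors, init_centroids=[vectors[0], vectors[3]])
print(labels)
```

The initial centroids are fixed here only to keep the run deterministic; in practice one would typically try several random initializations and keep the best result.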
Pages: 1389-1423
Number of pages: 35
Source
Ukkonen, Antti. Clustering algorithms for chains. Journal of Machine Learning Research, 2011, 12: 1389-1423.