A model for clustering data from heterogeneous dissimilarities

Cited by: 14
Authors
Santi, Everton [1 ]
Aloise, Daniel [2 ]
Blanchard, Simon J. [3 ]
Affiliations
[1] Univ Fed Rio Grande do Norte, Sch Sci & Technol, BR-59072970 Natal, RN, Brazil
[2] Univ Fed Rio Grande do Norte, Dept Comp Engn & Automat, BR-59072970 Natal, RN, Brazil
[3] Georgetown Univ, McDonough Sch Business, Washington, DC 20057 USA
Keywords
Data mining; Clustering; Heterogeneity; Optimization; Heuristics; VARIABLE NEIGHBORHOOD SEARCH; P-MEDIAN PROBLEM; CONSUMER; CONTEXT; BRANCH; REPRESENTATIONS; SUBSTITUTION; RELAXATIONS; PREFERENCE; ALGORITHM;
DOI
10.1016/j.ejor.2016.03.033
Chinese Library Classification
C93 [Management Science];
Discipline classification codes
12 ; 1201 ; 1202 ; 120202 ;
Abstract
Clustering algorithms partition a set of n objects into p groups (called clusters), such that objects assigned to the same groups are homogeneous according to some criteria. To derive these clusters, the data input required is often a single n x n dissimilarity matrix. Yet for many applications, more than one instance of the dissimilarity matrix is available and so to conform to model requirements, it is common practice to aggregate (e.g., sum up, average) the matrices. This aggregation practice results in clustering solutions that mask the true nature of the original data. In this paper we introduce a clustering model which, to handle the heterogeneity, uses all available dissimilarity matrices and identifies for groups of individuals clustering objects in a similar way. The model is a nonconvex problem and difficult to solve exactly, and we thus introduce a Variable Neighborhood Search heuristic to provide solutions efficiently. Computational experiments and an empirical application to perception of chocolate candy show that the heuristic algorithm is efficient and that the proposed model is suited for recovering heterogeneous data. Implications for clustering researchers are discussed. (C) 2016 Elsevier B.V. All rights reserved.
Pages: 659-672
Page count: 14
Related papers
50 in total
  • [21] Evidence Identification in Heterogeneous Data Using Clustering
    Mohammed, Hussam
    Clarke, Nathan
    Li, Fudong
    13TH INTERNATIONAL CONFERENCE ON AVAILABILITY, RELIABILITY AND SECURITY (ARES 2018), 2019,
  • [22] Transfer Heterogeneous Unlabeled Data for Unsupervised Clustering
    Kong, Shu
    Wang, Donghui
    2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 1193 - 1196
  • [23] Heterogeneous data integration with the consensus clustering formalism
    Filkov, V
    Skiena, S
    DATA INTEGRATION IN THE LIFE SCIENCES, PROCEEDINGS, 2004, 2994 : 110 - 123
  • [24] Federated learning with incremental clustering for heterogeneous data
    Espinoza Castellon, Fabiola
    Mayoue, Aurelien
    Sublemontier, Jacques-Henri
    Gouy-Pailler, Cedric
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [25] Cross-Domain Multilingual Clustering: A Generative Hybrid Model for Constructing and Enhancing Semantic Graphs from Heterogeneous Data
    Amani Mechergui
    Wahiba Ben Abdessalem Karaa
    Sami Zghal
    SN Computer Science, 5 (8)
  • [26] Group Clustering Using Inter-Group Dissimilarities
    Fesehaye, Debessay
    Singaravelu, Lenin
    Chen, Chien-Chia
    Huang, Xiaobo
    Banerjee, Amitabha
    Zhou, Ruijin
    Somasundaran, Rajesh
    2017 IEEE 37TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS 2017), 2017, : 1011 - 1021
  • [27] Spectral Clustering for Cell Formation with Minimum Dissimilarities Distance
    Nataliani, Yessica
    Yang, Miin-Shen
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2017, PT II, 2017, 10246 : 126 - 136
  • [28] A novel probabilistic clustering model for heterogeneous networks
    Zhi-Hong Deng
    Xiaoran Xu
    Machine Learning, 2016, 104 : 1 - 24
  • [29] A novel probabilistic clustering model for heterogeneous networks
    Deng, Zhi-Hong
    Xu, Xiaoran
    MACHINE LEARNING, 2016, 104 (01) : 1 - 24
  • [30] Quantifying dissimilarities between heterogeneous networks with community structure
    Xu, Xin-Jian
    Chen, Cheng
    Mendes, J. F. F.
    PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2022, 588