Clustering in applications with multiple data sources: A mutual subspace clustering approach

Cited by: 6
Authors
Hua, Ming [2 ]
Pei, Jian [1 ]
Affiliations
[1] Simon Fraser Univ, Sch Comp Sci, Burnaby, BC V5A 1S6, Canada
[2] Facebook Inc, Palo Alto, CA USA
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC)
Keywords
Clustering; Multiple sources;
DOI
10.1016/j.neucom.2011.08.032
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In many applications, such as bioinformatics and cross-market customer relationship management, there are data from multiple sources jointly describing the same set of objects. An important data mining task is to find interesting groups of objects that form clusters in subspaces of the data sources jointly supported by those data sources. In this paper, we study a novel problem of mining mutual subspace clusters from multiple sources. We develop two interesting models and the corresponding methods for mutual subspace clustering. The density-based model identifies dense regions in subspaces as clusters. The bottom-up method searches for density-based mutual subspace clusters systematically from low-dimensional subspaces to high-dimensional ones. The partitioning model divides points in a data set into k exclusive clusters and a signature subspace is found for each cluster, where k is the number of clusters desired by a user. The top-down method interleaves the well-known k-means clustering procedures in multiple sources. We use experimental results on synthetic data sets and real data sets to report the effectiveness and the efficiency of the methods. © 2012 Elsevier B.V. All rights reserved.
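The abstract does not spell out how the top-down method interleaves k-means across sources, so the following is only a rough sketch of one way such interleaving could look, not the authors' algorithm. It assumes two sources that describe the same objects in the same order, lets each source in turn recompute centroids from the current labels and reassign the objects, and omits the signature-subspace selection entirely. All function names (`centroids`, `assign`, `interleaved_kmeans`) and parameters (`rounds`, `seed`) are hypothetical.

```python
import random


def centroids(points, labels, k, rng):
    """Mean of each cluster; an empty cluster falls back to a random point."""
    dim = len(points[0])
    result = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if members:
            result.append(
                [sum(p[d] for p in members) / len(members) for d in range(dim)]
            )
        else:
            # Assumption: reseed an empty cluster with a randomly chosen object.
            result.append(list(rng.choice(points)))
    return result


def assign(points, cents):
    """Assign every point to its nearest centroid (squared Euclidean distance)."""
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    return [
        min(range(len(cents)), key=lambda c: sqdist(p, cents[c])) for p in points
    ]


def interleaved_kmeans(source_a, source_b, k, rounds=20, seed=0):
    """Alternate one k-means step in each source, sharing a single labelling."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in source_a]
    for _ in range(rounds):
        previous = labels
        for source in (source_a, source_b):
            # Labels produced in one source seed the centroids in the other.
            labels = assign(source, centroids(source, labels, k, rng))
        if labels == previous:  # no object changed cluster in a full round
            break
    return labels


# Toy data: four objects, each described in two sources.
source_a = [[0, 0], [0, 1], [10, 10], [10, 11]]
source_b = [[5.0], [5.2], [-3.0], [-3.1]]
labels = interleaved_kmeans(source_a, source_b, k=2)
```

On this toy input the first two objects and the last two objects are well separated in both sources, so a shared labelling that both sources support is easy to reach; real data would need the per-cluster subspace selection that the abstract mentions.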
Pages: 133-144
Number of pages: 12
Related papers
50 in total
  • [31] In Pursuit of Novelty: A Decentralized Approach to Subspace Clustering
    Rahmani, Mostafa
    Atia, George K.
    2016 54TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2016, : 447 - 451
  • [32] SUBSPACE CLUSTERING USING UNSUPERVISED DATA AUGMENTATION
    Abdolali, Maryam
    Gillis, Nicolas
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 3868 - 3872
  • [33] Kernel Subspace Clustering Algorithm for Categorical Data
    Xu K.-P.
    Chen L.-F.
    Sun H.-J.
    Wang B.-Z.
    Ruan Jian Xue Bao/Journal of Software, 2020, 31 (11): : 3492 - 3505
  • [34] SPARSE SUBSPACE CLUSTERING WITH MISSING AND CORRUPTED DATA
    Charles, Zachary
    Jalali, Amin
    Willett, Rebecca
    2018 IEEE DATA SCIENCE WORKSHOP (DSW), 2018, : 180 - 184
  • [35] Automatic Subspace Clustering of High Dimensional Data
    Rakesh Agrawal
    Johannes Gehrke
    Dimitrios Gunopulos
    Prabhakar Raghavan
    Data Mining and Knowledge Discovery, 2005, 11 : 5 - 33
  • [36] Subspace Clustering with Feature Grouping for Categorical Data
    Jia, Hong
    Dong, Menghan
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT I, KSEM 2023, 2023, 14117 : 247 - 254
  • [37] A subspace hierarchical clustering algorithm for categorical data
    Carbonera, Joel Luis
    Abel, Mara
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019, : 509 - 516
  • [38] Parallel Hierarchical Subspace Clustering of Categorical Data
    Pang, Ning
    Zhang, Jifu
    Zhang, Chaowei
    Qin, Xiao
    IEEE TRANSACTIONS ON COMPUTERS, 2019, 68 (04) : 542 - 555
  • [39] Subspace clustering of dimensionality-reduced data
    Heckel, Reinhard
    Tschannen, Michael
    Boelcskei, Helmut
    2014 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2014, : 2997 - 3001
  • [40] Automatic subspace clustering of high dimensional data
    Agrawal, R
    Gehrke, J
    Gunopulos, D
    Raghavan, P
    DATA MINING AND KNOWLEDGE DISCOVERY, 2005, 11 (01) : 5 - 33