Clustering in applications with multiple data sources: A mutual subspace clustering approach

Cited by: 6
Authors
Hua, Ming [2 ]
Pei, Jian [1 ]
Affiliations
[1] Simon Fraser Univ, Sch Comp Sci, Burnaby, BC V5A 1S6, Canada
[2] Facebook Inc, Palo Alto, CA USA
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Clustering; Multiple sources;
DOI
10.1016/j.neucom.2011.08.032
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In many applications, such as bioinformatics and cross-market customer relationship management, there are data from multiple sources jointly describing the same set of objects. An important data mining task is to find interesting groups of objects that form clusters in subspaces of the data sources jointly supported by those data sources. In this paper, we study a novel problem of mining mutual subspace clusters from multiple sources. We develop two interesting models and the corresponding methods for mutual subspace clustering. The density-based model identifies dense regions in subspaces as clusters. The bottom-up method searches for density-based mutual subspace clusters systematically from low-dimensional subspaces to high-dimensional ones. The partitioning model divides points in a data set into k exclusive clusters and a signature subspace is found for each cluster, where k is the number of clusters desired by a user. The top-down method interleaves the well-known k-means clustering procedures in multiple sources. We use experimental results on synthetic data sets and real data sets to report the effectiveness and the efficiency of the methods. © 2012 Elsevier B.V. All rights reserved.
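The idea of interleaving k-means across sources can be illustrated with a minimal sketch. This is not the paper's algorithm: it omits signature-subspace selection entirely. It only shows one plausible reading of "interleaving", in which a shared label vector is refined alternately in each source. All names here (`interleaved_kmeans`, `source_a`, `source_b`, `max_rounds`) are hypothetical, and the choice of the first k objects of source A as starting centroids is an assumption made to keep the example deterministic.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroids(points, labels, k):
    """Mean of each cluster in one source; an empty cluster yields None."""
    dims = len(points[0])
    sums = [[0.0] * dims for _ in range(k)]
    counts = [0] * k
    for p, c in zip(points, labels):
        counts[c] += 1
        for d in range(dims):
            sums[c][d] += p[d]
    return [[s / counts[c] for s in sums[c]] if counts[c] else None
            for c in range(k)]

def assign(points, cents):
    """Label each point with the index of its nearest non-empty centroid."""
    live = [c for c in range(len(cents)) if cents[c] is not None]
    return [min(live, key=lambda c: sq_dist(p, cents[c])) for p in points]

def interleaved_kmeans(source_a, source_b, k, max_rounds=50):
    """Alternate k-means update steps between two sources that describe
    the same objects (row i in both sources is object i)."""
    # Assumed initialization: first k objects of source A act as centroids.
    labels = assign(source_a, [list(p) for p in source_a[:k]])
    for _ in range(max_rounds):
        previous = labels
        for source in (source_a, source_b):
            # Update centroids in this source, then reassign using it alone.
            labels = assign(source, centroids(source, labels, k))
        if labels == previous:  # labels stable across a full round
            break
    return labels

# Six objects described by a 2-D source and a 1-D source.
source_a = [(0.0, 0.0), (10.0, 10.0), (0.5, 0.2),
            (9.5, 10.2), (0.1, 0.6), (10.3, 9.8)]
source_b = [(1.0,), (8.0,), (1.2,), (7.5,), (0.8,), (8.3,)]
print(interleaved_kmeans(source_a, source_b, 2))  # prints [0, 1, 0, 1, 0, 1]
```

Because each source can pull the labels in a different direction, this alternation is not guaranteed to settle in general, which is why the sketch caps the number of rounds with `max_rounds`.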
Pages: 133-144
Number of pages: 12