DBSC: A dependency-based subspace clustering algorithm for high dimensional numerical datasets

Cited by: 0
Authors
Wang, Xufei [1 ]
Li, Chunping [1 ]
Affiliations
[1] Tsinghua Univ, Sch Software, China MOE Key Lab Informat Syst Secur, Beijing, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We present a novel algorithm called DBSC, which finds subspace clusters in numerical datasets based on the concept of "dependency". The algorithm uses a depth-first search strategy to find the maximal subspaces: a new dimension is added to the current k-subspace, and the validity of the result as a (k+1)-subspace is evaluated. The clusters within those maximal subspaces are then mined in a fashion similar to the maximal subspace mining itself. Experiments on synthetic and real datasets show that the algorithm is both effective and efficient for high dimensional datasets.
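The depth-first extension described in the abstract can be illustrated with a minimal sketch. The abstract does not define the dependency measure or the validity test, so the sketch substitutes a stand-in: a dimension may join a subspace only if its absolute Pearson correlation with every dimension already in the subspace reaches a threshold. The names `pearson`, `is_valid_extension`, `maximal_subspaces`, and `threshold` are hypothetical and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def is_valid_extension(columns, subspace, dim, threshold):
    """Stand-in validity test (an assumption, not the paper's dependency criterion):
    `dim` must be strongly correlated with every dimension already in `subspace`."""
    return all(
        abs(pearson(columns[d], columns[dim])) >= threshold for d in subspace
    )


def maximal_subspaces(columns, threshold=0.8):
    """Depth-first search: grow a k-subspace by one dimension at a time,
    recording a subspace when no further dimension can be added."""
    n_dims = len(columns)
    found = set()

    def dfs(subspace):
        extended = False
        # Only try higher-indexed dimensions so each subspace is built once.
        for dim in range(subspace[-1] + 1, n_dims):
            if is_valid_extension(columns, subspace, dim, threshold):
                extended = True
                dfs(subspace + [dim])
        if not extended:
            found.add(frozenset(subspace))

    for start in range(n_dims):
        dfs([start])

    # Drop any recorded subspace that is strictly contained in another one.
    maximal = [s for s in found if not any(s < t for t in found)]
    return sorted(sorted(s) for s in maximal)
```

Here `columns` is a list of per-dimension value lists. Because the stand-in test is pairwise, the final containment filter is enough to discard leaves that are not truly maximal; a different validity criterion might need a different pruning step.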
Pages: 832-837
Number of pages: 6
Related papers
50 in total
  • [1] Gaussian mixture copulas for high-dimensional clustering and dependency-based subtyping
    Kasa, Siva Rajesh
    Bhattacharya, Sakyajit
    Rajan, Vaibhav
    [J]. BIOINFORMATICS, 2020, 36 (02) : 621 - 628
  • [2] A novel soft subspace clustering algorithm with noise detection for high dimensional datasets
    Chitsaz, Elham
    Jahromi, Mansoor Zolghadri
    [J]. SOFT COMPUTING, 2016, 20 (11) : 4463 - 4472
  • [3] A novel soft subspace clustering algorithm with noise detection for high dimensional datasets
    Elham Chitsaz
    Mansoor Zolghadri Jahromi
    [J]. Soft Computing, 2016, 20 : 4463 - 4472
  • [4] A fuzzy subspace algorithm for clustering high dimensional data
    Gan, Guojun
    Wu, Jianhong
    Yang, Zijiang
    [J]. ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2006, 4093 : 271 - 278
  • [5] MR-Mafia: Parallel Subspace Clustering Algorithm Based on MapReduce For Large Multi-dimensional Datasets
    Gao, Zhipeng
    Fan, Yidan
    Niu, Kun
    Ying, Zhenyi
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING (BIGCOMP), 2018, : 257 - 262
  • [6] Evolutionary Subspace Clustering Algorithm for High-Dimensional Data
    Nourashrafeddin, S. N.
    Arnold, Dirk V.
    Milios, Evangelos
    [J]. PROCEEDINGS OF THE FOURTEENTH INTERNATIONAL CONFERENCE ON GENETIC AND EVOLUTIONARY COMPUTATION COMPANION (GECCO'12), 2012, : 1497 - 1498
  • [7] PARTCAT: A subspace clustering algorithm for high dimensional categorical data
    Gan, Guojun
    Wu, Jianhong
    Yang, Zijiang
    [J]. 2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 4406 - +
  • [8] Parallel social spider clustering algorithm for high dimensional datasets
    Shukla, Urvashi Prakash
    Nanda, Satyasai Jagannath
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2016, 56 : 75 - 90
  • [9] A grid-based subspace clustering algorithm for high-dimensional data streams
    Sun, Yufen
    Lu, Yansheng
    [J]. WEB INFORMATION SYSTEMS - WISE 2006 WORKSHOPS, PROCEEDINGS, 2006, 4256 : 37 - 48
  • [10] Independence is good: Dependency-based histogram synopses for high-dimensional data
    Deshpande, A
    Garofalakis, M
    Rastogi, R
    [J]. SIGMOD RECORD, 2001, 30 (02) : 199 - 210