CLINCH: Clustering incomplete high-dimensional data for data mining application

Cited by: 0
Authors
Cheng, ZP [1]
Zhou, D
Wang, C
Guo, JK
Wang, W
Ding, BK
Shi, B
Affiliations
[1] Fudan Univ, Shanghai, Peoples R China
[2] Penn State Univ, University Pk, PA 16802 USA
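Keywords
clustering; incomplete data; high-dimensional data
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract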
Clustering is a common data mining technique for discovering hidden patterns in massive datasets. With the development of privacy-preserving data mining applications, clustering incomplete high-dimensional data has become increasingly useful. Motivated by this need, we develop a novel algorithm, CLINCH, which can produce fine clusters in an incomplete high-dimensional data space. To handle missing attributes, CLINCH employs a prediction method that can be more precise than traditional techniques. We also introduce an efficient scheme in which dimensions are processed one by one to attack the "curse of dimensionality". Experiments show that our algorithm not only outperforms many existing high-dimensional clustering algorithms in scalability and efficiency, but also produces precise results.
Pages: 88-99
Page count: 12