Automatic aspect discrimination in data clustering

Cited by: 10
Authors
Horta, Danilo [1]
Campello, Ricardo J. G. B. [1]
Affiliation
[1] Univ Sao Paulo, ICMC, BR-13560970 Sao Carlos, SP, Brazil
Funding
São Paulo Research Foundation (FAPESP), Brazil
Keywords
Clustering; Aspect discrimination; Attribute weighting; Cluster validation; FUZZY EXTENSION; RELATIONAL DATA; VALIDITY; AGGREGATION; VALIDATION; ALGORITHMS; COMPLEXITY; CRITERION; INDEXES;
DOI
10.1016/j.patcog.2012.05.011
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The attributes describing a data set may often be arranged in meaningful subsets, each of which corresponds to a different aspect of the data. An unsupervised algorithm (SCAD) that simultaneously performs fuzzy clustering and aspect weighting was proposed in the literature. However, SCAD may fail and halt under certain conditions. To fix this problem, its steps are modified and then reordered to reduce the number of parameters that the user must set. In this paper we prove that each step of the resulting algorithm, named ASCAD, globally minimizes its cost function with respect to the argument being optimized. The asymptotic analysis of ASCAD leads to a time complexity which is the same as that of fuzzy c-means. A hard version of the algorithm and a novel validity criterion that considers aspect weights in order to estimate the number of clusters are also described. The proposed method is assessed over several artificial and real data sets. © 2012 Elsevier Ltd. All rights reserved.
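The abstract compares ASCAD's time complexity with that of fuzzy c-means. For orientation only, here is a minimal sketch of standard fuzzy c-means in plain Python. It is not ASCAD or SCAD and has no aspect weights. The function name `fuzzy_c_means`, its parameters and their defaults are illustrative assumptions and do not come from the paper.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (illustrative; NOT the paper's ASCAD).

    points: list of equal-length numeric tuples; c: number of clusters;
    m: fuzzifier (> 1). Returns (centers, memberships).
    """
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(c)]
        s = sum(row)
        u.append([r / s for r in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][k] for i in range(n)) / total
                            for k in range(d)])
        # Membership update from squared Euclidean distances.
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][k] - centers[j][k]) ** 2 for k in range(d))
                  for j in range(c)]
            zeros = [j for j in range(c) if d2[j] == 0.0]
            if zeros:
                # Point coincides with one or more centers: split membership.
                new = [1.0 / len(zeros) if j in zeros else 0.0
                       for j in range(c)]
            else:
                inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
                s = sum(inv)
                new = [v / s for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new
        if shift < tol:  # memberships have stopped changing
            break
    return centers, u


# Toy data (made up for illustration): two well-separated groups.
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centers, u = fuzzy_c_means(pts, c=2)
labels = [row.index(max(row)) for row in u]  # hard labels from memberships
```

Each iteration of this sketch costs O(n · c · d) for n points, c clusters and d attributes, since both updates visit every point, cluster and attribute once.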
Pages: 4370-4388
Number of pages: 19