A Roughset Based Data Labeling Method for Clustering Categorical Data

Cited by: 0
Authors
Reddy, H. Venkateswara [1]
Raju, S. Viswanadha [2]
Affiliations
[1] Vardhaman Coll Engn, Dept Comp Sci & Engn, Hyderabad, Andhra Pradesh, India
[2] JNTUH Coll Engn, Dept Comp Sci & Engn, Nachupally, Karimnagar, India
Keywords
Clustering; Data labeling; Categorical Data; Outlier; Rough Sets; Entropy;
DOI
10.1109/Eco-friendly.2014.86
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Data mining represents the process of extracting analytical information from huge databases. Clustering is one of the efficient techniques in data mining, and it is performed on the principle of similarity. Clustering a large database is a demanding and time-consuming task. For this reason, an approach called data labeling through sampling is used. Data labeling is the process of assigning the unsampled data objects to appropriate clusters. This approach makes clustering the data easier and improves its efficiency. In this method, a sample dataset is first chosen from the large database and clustered; once this initial clustering is complete, the unsampled data objects are compared with the existing clusters. As a result, similar data objects are given the appropriate cluster labels and dissimilar ones are treated as outliers. Such data labeling methods are easy to carry out on numerical data, but the task is complicated for categorical data because no natural distance between data objects exists. In the proposed method, a new and efficient data labeling technique is used to cluster categorical data based on cluster entropy in rough set theory. The experimental results show that the proposed algorithm is efficient and produces higher-quality clusters than previous clustering methods.
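The labeling step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' algorithm: it substitutes plain Shannon entropy over attribute values for the rough-set cluster entropy, which the abstract does not define. The function names `cluster_entropy` and `label_object`, the size-weighted entropy cost, and the outlier `threshold` are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def cluster_entropy(rows):
    """Sum, over attributes, of the Shannon entropy of the value distribution.

    Assumption: stands in for the rough-set cluster entropy of the paper.
    """
    if not rows:
        return 0.0
    n = len(rows)
    total = 0.0
    for column in zip(*rows):  # iterate attribute by attribute
        counts = Counter(column)
        total -= sum((c / n) * log2(c / n) for c in counts.values())
    return total


def label_object(obj, clusters, threshold):
    """Assign an unsampled object to the cluster whose size-weighted
    entropy grows least; return None (outlier) if even the smallest
    increase exceeds the hypothetical threshold."""
    best_label, best_cost = None, float("inf")
    for label, rows in clusters.items():
        before = len(rows) * cluster_entropy(rows)
        after = (len(rows) + 1) * cluster_entropy(rows + [obj])
        cost = after - before
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label if best_cost <= threshold else None


if __name__ == "__main__":
    # Clusters obtained from an initial clustering of the sample (toy data).
    clusters = {
        "c1": [("a", "x"), ("a", "x"), ("a", "y")],
        "c2": [("b", "z"), ("b", "z"), ("b", "w")],
    }
    print(label_object(("a", "x"), clusters, threshold=2.0))  # c1
    print(label_object(("q", "r"), clusters, threshold=2.0))  # None (outlier)
```

An entropy-based cost is used here because, as the abstract notes, categorical objects have no natural distance; an object that fits a cluster adds little disorder to it, while an object with unseen attribute values raises the entropy of every cluster and is flagged as an outlier.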
Pages: 51-55
Number of pages: 5