New Distance Measure based on the Domain for Categorical Data

Cited by: 0
Authors
Aranganayagi, S. [1]
Thangavel, K. [2]
Sujatha, S. [3]
Affiliations
[1] JKK Nataraja Coll Arts & Sci, Komarapalayam 638183, Tamil Nadu, India
[2] Periyar Univ, Dept Comp Sci, Salem 636011, Tamil Nadu, India
[3] Kongu Engn Coll, Dept Comp Applicat, Perundurai 638052, Tamil Nadu, India
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Clustering, the process of grouping homogeneous objects, is an important data mining task. Few algorithms exist for clustering categorical data. K-Modes is a scalable and efficient algorithm for clustering categorical data. In this paper we propose a new distance measure for K-Modes based on the cardinality of the domain of each attribute. The proposed method is evaluated on data sets obtained from the UCI data repository. The results indicate that the proposed measure generates better clusters than the original K-Modes algorithm.
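The abstract does not state the proposed formula, so the following is only a minimal sketch of the ingredients it mentions. `simple_matching` is the mismatch count conventionally used by K-Modes, and `domain_cardinalities` counts the distinct values of each attribute. The function `domain_weighted` is a hypothetical illustration of how a cardinality could enter a distance; its 1/|D_j| weighting is an assumption made here and should not be read as the authors' measure.

```python
from collections import Counter


def simple_matching(x, y):
    """Conventional K-Modes dissimilarity: count of attributes where x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def domain_cardinalities(records):
    """Number of distinct values observed for each attribute (column)."""
    return [len(set(column)) for column in zip(*records)]


def domain_weighted(x, y, cardinalities):
    """HYPOTHETICAL weighting, not the paper's formula.

    Each mismatch on attribute j contributes 1/|D_j|, so a mismatch on an
    attribute with many possible values counts less than one on a binary
    attribute.
    """
    return sum(1.0 / c for a, b, c in zip(x, y, cardinalities) if a != b)


def mode_of(records):
    """Cluster representative in K-Modes: the most frequent value per attribute."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


# Toy categorical records: (colour, size, shape). Made up for illustration.
records = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "square"),
    ("green", "small", "round"),
]

cards = domain_cardinalities(records)
plain = simple_matching(records[0], records[2])
weighted = domain_weighted(records[0], records[2], cards)
centre = mode_of(records)
```

In this toy data the first and third records differ on colour and shape. The plain measure counts both mismatches equally, while the hypothetical weighted variant discounts the colour mismatch because colour has the larger domain.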
Pages: 93 / +
Page count: 2