Clustering item data sets with association-taxonomy similarity

Cited by: 1
Authors
Yun, CH [1]
Chuang, KT [1]
Chen, MS [1]
Affiliations
[1] Natl Taiwan Univ, Dept Elect Engn, Grad Inst Commun Engn, Taipei, Taiwan
DOI
10.1109/ICDM.2003.1251011
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We explore the efficient clustering of item data. Unlike traditional data, item data are high-dimensional and sparse. To account for these characteristics, we devise a novel measure, called the association-taxonomy similarity, and use it to perform the clustering. Based on this measure, we develop an efficient clustering algorithm for item data, called algorithm AT (for Association-Taxonomy). We also devise two validation indexes, based on association and taxonomy properties, to assess the quality of clustering for item data. Experimental results on a real dataset show that algorithm AT significantly outperforms prior work in clustering quality as measured by the validation indexes, indicating the usefulness of association-taxonomy similarity in item data clustering.
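The abstract does not define the association-taxonomy similarity, so the following is only an illustrative sketch of how an association signal and a taxonomy signal between two items might be combined. Everything in it is an assumption made for illustration and is not taken from the paper: the toy transactions, the toy taxonomy, the use of Jaccard overlap for association, the inverse path length for taxonomy, the weight `alpha`, and the function names. It is not algorithm AT.

```python
# Hypothetical toy data: market-basket transactions and a taxonomy
# stored as child -> parent links. None of this comes from the paper.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"beer", "chips"},
    {"beer", "chips", "milk"},
]
parent = {
    "milk": "dairy", "butter": "dairy", "bread": "bakery",
    "beer": "drinks", "chips": "snacks",
    "dairy": "food", "bakery": "food", "snacks": "food", "drinks": "food",
}


def association_sim(a, b):
    """Jaccard overlap of the sets of transactions containing a and b (assumed measure)."""
    ta = {i for i, t in enumerate(transactions) if a in t}
    tb = {i for i, t in enumerate(transactions) if b in t}
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def ancestors(item):
    """Path from the item up to the taxonomy root, item first."""
    path = [item]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def taxonomy_sim(a, b):
    """1 / (1 + number of edges between a and b in the taxonomy tree) (assumed measure)."""
    pa, pb = ancestors(a), ancestors(b)
    common = next(x for x in pa if x in pb)  # lowest common ancestor
    return 1.0 / (1 + pa.index(common) + pb.index(common))


def at_sim(a, b, alpha=0.5):
    """Weighted mix of the two signals; alpha is a hypothetical tuning weight."""
    return alpha * association_sim(a, b) + (1 - alpha) * taxonomy_sim(a, b)


if __name__ == "__main__":
    for pair in [("milk", "bread"), ("milk", "butter"), ("beer", "chips")]:
        print(pair, round(at_sim(*pair), 3))
```

In this toy setup, items that are frequently bought together score high on the association term even when they sit far apart in the taxonomy, while items under the same category score high on the taxonomy term even when they rarely co-occur. A pairwise score of this kind could then be passed to any similarity-based clustering procedure.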
Pages: 697-700
Page count: 4