Clustering item data sets with association-taxonomy similarity

Cited by: 1
Authors
Yun, CH [1 ]
Chuang, KT [1 ]
Chen, MS [1 ]
Affiliation
[1] Natl Taiwan Univ, Dept Elect Engn, Grad Inst Commun Engn, Taipei, Taiwan
DOI
10.1109/ICDM.2003.1251011
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper explores the efficient clustering of item data. Unlike traditional data, item data are characterized by high dimensionality and sparsity. In view of these characteristics, we devise a novel measure, called the association-taxonomy similarity, and use it to perform the clustering. With this similarity measure, we develop an efficient clustering algorithm for item data, called algorithm AT (standing for Association-Taxonomy). Two validation indexes based on association and taxonomy properties are also devised to assess the quality of clustering for item data. Experimental results on a real dataset show that algorithm AT significantly outperforms prior works in clustering quality as measured by the validation indexes, indicating the usefulness of association-taxonomy similarity in item data clustering.
Pages: 697 - 700
Page count: 4
Related papers
50 in total
  • [1] Measuring Similarity of Complex and Heterogeneous Data in Clustering of Large Data Sets
    Bacelar-Nicolau, Helena
    Nicolau, Fernando
    Sousa, Aurga
    Bacelar-Nicolau, Leonor
    BIOCYBERNETICS AND BIOMEDICAL ENGINEERING, 2009, 29 (02) : 9 - 18
  • [2] Determination of similarity threshold in clustering problems for large data sets
    Sánchez-Díaz, G
    Martínez-Trinidad, JF
    PROGRESS IN PATTERN RECOGNITION, SPEECH AND IMAGE ANALYSIS, 2003, 2905 : 611 - 618
  • [3] Estimating Sequence Similarity from Read Sets for Clustering Sequencing Data
    Rysavy, Petr
    Zelezny, Filip
    ADVANCES IN INTELLIGENT DATA ANALYSIS XV, 2016, 9897 : 204 - 214
  • [4] A clustering with slope algorithm based on item similarity
    Wu Huiyun
    Wang Yuping
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2016, 31 (04) : 2177 - 2185
  • [5] Clustering of Complex Data-sets using Fractal Similarity Measures and Uncertainties
    Hoecker, Maximilian
    Polsterer, Kai Lars
    Kuegler, Sven Dennis
    Heuveline, Vincent
    2015 IEEE 18TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE), 2015 : 82 - 91
  • [6] An innovative clustering approach utilizing frequent item sets
    Manzali Y.
    Barry K.A.
    Flouchi R.
    Balouki Y.
    Elfar M.
    Multimedia Tools and Applications, 2025, 84 (10) : 7835 - 7861
  • [7] ClusTrack: Feature Extraction and Similarity Measures for Clustering of Genome-Wide Data Sets
    Rydbeck, Halfdan
    Sandve, Geir Kjetil
    Ferkingstad, Egil
    Simovski, Boris
    Rye, Morten
    Hovig, Eivind
    PLOS ONE, 2015, 10 (04)
  • [8] Experiments on Rough Sets Clustering with Various Similarity Measures
    Szederjesi-Dragomir, Arnold
    Gaceanu, Radu D.
    Pop, Horia F.
    Sarbu, Costel
    IPSI BGD TRANSACTIONS ON INTERNET RESEARCH, 2020, 16 (02) : 75 - 83
  • [9] A Comparison Study of Similarity Measures in Rough Sets Clustering
    Szederjesi-Dragomir, Arnold
    Gaceanu, Radu D.
    Pop, Horia F.
    Sarbu, Costel
    2019 IEEE 15TH INTERNATIONAL SCIENTIFIC CONFERENCE ON INFORMATICS (INFORMATICS 2019), 2019 : 37 - 42
  • [10] BAYESIAN CLUSTERING OF DATA SETS
    MENZEFRICKE, U
    COMMUNICATIONS IN STATISTICS PART A-THEORY AND METHODS, 1981, 10 (01) : 65 - 77