MaPle: A fast algorithm for maximal pattern-based clustering

Cited by: 0
Authors
Pei, J [1 ]
Zhang, XL [1 ]
Cho, MJ [1 ]
Wang, HX [1 ]
Yu, PS [1 ]
Affiliation
[1] SUNY Buffalo, Buffalo, NY 14260 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Pattern-based clustering is important in many applications, such as DNA microarray data analysis, automatic recommendation systems, and target marketing systems. However, pattern-based clustering in large databases is challenging. On the one hand, there can be a huge number of clusters, many of them redundant, which makes pattern-based clustering ineffective. On the other hand, previously proposed methods may not be efficient or scalable in mining large databases. In this paper, we study the problem of maximal pattern-based clustering. Redundant clusters are avoided completely by mining only the maximal pattern-based clusters. We develop MaPle, an efficient and scalable mining algorithm. It conducts a depth-first, divide-and-conquer search and prunes unnecessary branches. Our extensive performance study on both synthetic and real data sets shows that maximal pattern-based clustering is effective: it reduces the number of clusters substantially. Moreover, MaPle is more efficient and scalable than previously proposed pattern-based clustering methods in mining large databases.
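The abstract does not spell out the cluster definition, so the following is only a minimal sketch under a stated assumption: it uses the δ-pCluster model common in the pattern-based clustering literature, where a set of objects and attributes forms a cluster if every 2x2 submatrix has pScore at most δ. The function names (`p_score`, `is_delta_pcluster`, `maximal_only`) are hypothetical, and the code is a brute-force check plus a maximality filter. It is not MaPle's depth-first search or its pruning.

```python
from itertools import combinations


def p_score(x, y, a, b):
    """pScore of the 2x2 submatrix formed by rows x, y and columns a, b
    (assumed delta-pCluster model, not taken from the abstract)."""
    return abs((x[a] - x[b]) - (y[a] - y[b]))


def is_delta_pcluster(data, objects, attrs, delta):
    """True if every 2x2 submatrix of (objects, attrs) has pScore <= delta."""
    return all(
        p_score(data[i], data[j], a, b) <= delta
        for i, j in combinations(objects, 2)
        for a, b in combinations(attrs, 2)
    )


def maximal_only(clusters):
    """Keep only clusters not strictly contained in another cluster.
    Each cluster is a pair (frozenset of objects, frozenset of attributes)."""
    def contained(c, d):
        return c != d and c[0] <= d[0] and c[1] <= d[1]

    return [c for c in clusters if not any(contained(c, d) for d in clusters)]


if __name__ == "__main__":
    # Rows 0 and 1 follow the same pattern, shifted by a constant; row 2 does not.
    data = [[1, 4, 2], [3, 6, 4], [10, 2, 7]]
    print(is_delta_pcluster(data, [0, 1], [0, 1, 2], delta=0))
    print(is_delta_pcluster(data, [0, 1, 2], [0, 1, 2], delta=1))
```

The maximality filter illustrates why mining only maximal clusters removes redundancy: any cluster whose objects and attributes are both subsets of another cluster's carries no extra information and is dropped.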
Pages: 259-266
Page count: 8