MaPle: A fast algorithm for maximal pattern-based clustering

Cited by: 0
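Authors
Pei, J [1]
Zhang, XL [1]
Cho, MJ [1]
Wang, HX [1]
Yu, PS [1]
Affiliations
[1] SUNY Buffalo, Buffalo, NY 14260 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract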
Pattern-based clustering is important in many applications, such as DNA micro-array data analysis, automatic recommendation systems and target marketing systems. However, pattern-based clustering in large databases is challenging. On the one hand, there can be a huge number of clusters, and many of them can be redundant, which makes pattern-based clustering ineffective. On the other hand, previously proposed methods may not be efficient or scalable in mining large databases. In this paper, we study the problem of maximal pattern-based clustering. Redundant clusters are avoided completely by mining only the maximal pattern-based clusters. MaPle, an efficient and scalable mining algorithm, is developed. It conducts a depth-first, divide-and-conquer search and prunes unnecessary branches. Our extensive performance study on both synthetic and real data sets shows that maximal pattern-based clustering is effective: it reduces the number of clusters substantially. Moreover, MaPle is more efficient and scalable than previously proposed pattern-based clustering methods in mining large databases.
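The abstract does not define a pattern-based cluster or give MaPle's search procedure, so the following is only a minimal sketch under stated assumptions. It assumes the pScore-style definition common in the pattern-based clustering literature: a set of rows and columns forms a cluster when every 2x2 submatrix has pScore at most a threshold `delta`. The function names, the toy `data` matrix and the brute-force maximality check are hypothetical illustrations, not the depth-first search with pruning described above, and minimum-size thresholds are ignored.

```python
from itertools import combinations


def is_delta_pcluster(data, rows, cols, delta):
    """Return True if every 2x2 submatrix over (rows, cols) has pScore <= delta.

    Assumed definition: pScore = |(d[x][a] - d[x][b]) - (d[y][a] - d[y][b])|.
    """
    for x, y in combinations(rows, 2):
        for a, b in combinations(cols, 2):
            p_score = abs((data[x][a] - data[x][b]) - (data[y][a] - data[y][b]))
            if p_score > delta:
                return False
    return True


def is_maximal(data, rows, cols, delta):
    """Brute-force check: the cluster is maximal if no single row or column can be added.

    Under the assumed definition every sub-cluster of a cluster is itself a
    cluster, so testing single-element extensions is sufficient.
    """
    rows, cols = list(rows), list(cols)
    for r in range(len(data)):
        if r not in rows and is_delta_pcluster(data, rows + [r], cols, delta):
            return False
    for c in range(len(data[0])):
        if c not in cols and is_delta_pcluster(data, rows, cols + [c], delta):
            return False
    return True


# Toy matrix (hypothetical): rows 0-2 follow the same shifted pattern, row 3 does not.
data = [
    [1, 2, 4],
    [3, 4, 6],
    [5, 6, 8],
    [9, 1, 0],
]

coherent = is_delta_pcluster(data, [0, 1, 2], [0, 1, 2], delta=0)
redundant = not is_maximal(data, [0, 1], [0, 1, 2], delta=0)
```

In this toy example, rows 0 and 1 form a valid cluster that is contained in the larger cluster over rows 0 to 2, so it is the kind of redundant cluster that mining only maximal clusters would leave out.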
Pages: 259-266
Page count: 8