Discovering pattern-based subspace clusters by pattern tree

Cited by: 10
Authors
Guan, Jihong [1]
Gan, Yanglan [1]
Wang, Hao [2]
Affiliations
[1] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China
[2] Hefei Univ Technol, Dept Comp Sci & Technol, Hefei 230009, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Clustering analysis; Subspace clustering; Pattern similarity; Pattern tree;
DOI
10.1016/j.knosys.2009.02.011
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Traditional clustering models based on distance similarity are not always effective at capturing correlations among data objects, whereas pattern-based clustering can identify correlations hidden among them. However, state-of-the-art pattern-based clustering methods are inefficient and provide no metric for measuring clustering quality. This paper presents a new pattern-based subspace clustering method that addresses both problems. Observing the analogy between mining frequent itemsets and discovering subspace clusters, we apply the pattern tree, a structure used in frequent itemset mining, to determine the target subspaces in a single scan of the database, which is efficient on large datasets. Furthermore, we introduce a general clustering quality evaluation model to guide the identification of meaningful clusters. The proposed method lets users flexibly set quality-control parameters to meet different needs. Experimental results on synthetic and real datasets show that our method outperforms existing methods in both efficiency and effectiveness. (C) 2009 Elsevier B.V. All rights reserved.
Pages: 569-579
Number of pages: 11