Multi-step density-based clustering

Authors
Stefan Brecheisen
Hans-Peter Kriegel
Martin Pfeifle
Affiliation
[1] University of Munich, Institute for Informatics
Keywords
Approximated clustering; Complex objects; Data mining; Density-based clustering;
DOI
Not available
Abstract
Data mining in large databases of complex objects from scientific, engineering or multimedia applications is becoming increasingly important. In many areas, complex distance measures are the first choice, but simpler distance functions that can be computed much more efficiently are also available. In this paper, we demonstrate how the paradigm of multi-step query processing, which relies on exact as well as lower-bounding approximated distance functions, can be integrated into the two density-based clustering algorithms DBSCAN and OPTICS, resulting in a considerable efficiency boost. Our approach tries to confine itself to ε-range queries on the simple distance functions and carries out complex distance computations only at those stages of the clustering algorithm where they are required to compute the correct clustering result. Furthermore, we show how our approach can be used for approximated clustering, allowing the user to find an individual trade-off between quality and efficiency. To assess the quality of the resulting clusterings, we introduce suitable quality measures that can be used generally for evaluating approximated partitioning and hierarchical clusterings. In a broad experimental evaluation based on real-world test data sets, we demonstrate that our approach accelerates the generation of exact density-based clusterings by more than one order of magnitude. Furthermore, we show that our approximated clustering approach yields high-quality clusterings, where the desired quality is scalable with respect to the overall number of exact distance computations.
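The filter-and-refine idea behind multi-step query processing can be illustrated with a minimal Python sketch of a single ε-range query. This is not the authors' algorithm, which integrates the paradigm into DBSCAN and OPTICS and postpones exact computations further; it only shows why a lower-bounding filter distance allows safe pruning. The names `filter_distance`, `exact_distance` and `multi_step_range_query` are hypothetical, and the two distance functions are simple stand-ins chosen so that the filter never exceeds the exact distance.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def filter_distance(a: Point, b: Point) -> float:
    """Cheap stand-in filter: difference in the first coordinate only.

    Assumption for illustration: this never exceeds the Euclidean
    distance, so it is a valid lower bound of exact_distance.
    """
    return abs(a[0] - b[0])


def exact_distance(a: Point, b: Point) -> float:
    """Stand-in for an expensive exact measure: full Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def multi_step_range_query(
    db: Sequence[Point],
    query: Point,
    eps: float,
    lower_bound: Callable[[Point, Point], float] = filter_distance,
    exact: Callable[[Point, Point], float] = exact_distance,
) -> Tuple[List[int], int]:
    """Return indices of objects within eps of query, plus the number
    of exact distance computations that were actually performed."""
    result: List[int] = []
    exact_calls = 0
    for idx, obj in enumerate(db):
        # Filter step: if even the lower bound exceeds eps, the exact
        # distance must exceed eps too, so the object is pruned.
        if lower_bound(query, obj) > eps:
            continue
        # Refinement step: only surviving candidates pay the exact cost.
        exact_calls += 1
        if exact(query, obj) <= eps:
            result.append(idx)
    return result, exact_calls


if __name__ == "__main__":
    db = [(0.0, 0.0), (1.0, 0.0), (1.0, 5.0), (4.0, 0.0)]
    hits, calls = multi_step_range_query(db, (0.0, 0.0), 1.5)
    print(hits, calls)
```

Because the filter is a lower bound, pruning never discards a true result, so the query stays exact while the number of expensive computations drops. A tighter filter prunes more candidates; how the pruning is interleaved with the clustering itself is the subject of the paper and is not reproduced here.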
Pages: 284-308
Page count: 24