Multi-step density-based clustering

Authors
Stefan Brecheisen
Hans-Peter Kriegel
Martin Pfeifle
Affiliation
[1] University of Munich, Institute for Informatics
Keywords
Approximated clustering; Complex objects; Data mining; Density-based clustering
DOI
Not available
Abstract
Data mining in large databases of complex objects from scientific, engineering, or multimedia applications is becoming increasingly important. In many areas, complex distance measures are the first choice, but simpler distance functions that can be computed much more efficiently are also available. In this paper, we demonstrate how the paradigm of multi-step query processing, which relies on exact as well as lower-bounding approximated distance functions, can be integrated into the two density-based clustering algorithms DBSCAN and OPTICS, resulting in a considerable efficiency boost. Our approach tries to confine itself to ε-range queries on the simple distance functions and carries out complex distance computations only at those stages of the clustering algorithm where they are required to compute the correct clustering result. Furthermore, we show how our approach can be used for approximated clustering, allowing the user to find an individual trade-off between quality and efficiency. To assess the quality of the resulting clusterings, we introduce suitable quality measures that can be used generally for evaluating approximated partitioning and hierarchical clusterings. In a broad experimental evaluation based on real-world test data sets, we demonstrate that our approach accelerates the generation of exact density-based clusterings by more than one order of magnitude. We also show that our approximated clustering approach yields high-quality clusterings, where the desired quality is scalable with respect to the overall number of exact distance computations.
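The filter-and-refine idea behind multi-step query processing can be illustrated with a small sketch. This is not the paper's DBSCAN or OPTICS integration, only the generic building block: a cheap distance that never exceeds the exact one is used to discard candidates before the expensive distance is computed. The function names, the toy data, and the choice of "Euclidean distance on the first coordinate" as the lower bound are assumptions made purely for illustration.

```python
from math import sqrt


def exact_distance(a, b):
    """Stand-in for an expensive exact distance (here: full Euclidean)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def lower_bound_distance(a, b, k=1):
    """Hypothetical cheap filter distance: Euclidean over the first k coordinates.

    Dropping non-negative squared terms can only shrink the sum, so this
    value never exceeds exact_distance(a, b) (the lower-bounding property).
    """
    return sqrt(sum((x - y) ** 2 for x, y in zip(a[:k], b[:k])))


def multi_step_range_query(query, database, eps):
    """Return (indices within eps under the exact distance, exact computations used)."""
    hits = []
    exact_calls = 0
    for i, obj in enumerate(database):
        # Filter step: a lower bound above eps implies the exact distance is above eps.
        if lower_bound_distance(query, obj) > eps:
            continue
        # Refinement step: only surviving candidates pay for the exact distance.
        exact_calls += 1
        if exact_distance(query, obj) <= eps:
            hits.append(i)
    return hits, exact_calls


# Toy data, assumed for illustration only.
database = [(0.0, 0.0), (1.0, 0.0), (0.5, 3.0), (5.0, 5.0), (0.2, 0.1)]
hits, exact_calls = multi_step_range_query((0.0, 0.0), database, eps=1.5)
print(hits, exact_calls)
```

Because the filter distance is a lower bound, the result is identical to a scan that uses only the exact distance, while objects rejected by the filter never trigger an exact computation. The abstract indicates that the paper goes further by postponing exact computations inside the clustering algorithms themselves, which this sketch does not attempt.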
Pages: 284-308
Number of pages: 24