An Efficient Density Based Incremental Clustering Algorithm in Data Warehousing Environment

Cited by: 0
Authors
Goyal, Navneet [1]
Goyal, Poonam [1]
Venkatramaiah, K. [1]
Deepak, P. C. [1]
Sanoop, P. S. [1]
Affiliation
[1] BITS, Dept Comp Sci & Informat Syst, Pilani 333031, Rajasthan, India
Keywords
Incremental clustering; DBSCAN; Incremental DBSCAN
DOI
Not available
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Data warehouses are a good source of data for downstream data mining applications. New data arrives in a data warehouse during its periodic refresh cycles. Because this data is appended to the existing data, all patterns discovered earlier by the various data mining algorithms must be updated with each refresh. In this paper, we present an incremental density-based clustering algorithm. Incremental DBSCAN is an existing incremental algorithm in which data can be added to or deleted from existing clusters one point at a time. Our algorithm is capable of adding points in bulk to an existing set of clusters. In this new algorithm, the data points to be added are first clustered using the DBSCAN algorithm, and these new clusters are then merged with the existing clusters to produce the modified set of clusters. That is, we add clusters incrementally rather than adding points incrementally. The proposed incremental clustering algorithm is found to produce the same clusters as Incremental DBSCAN. We use R*-trees as the data structure to hold the multidimensional data to be clustered. One of the major advantages of the proposed approach is that it lets us see the clustering patterns of the new data alongside the existing clustering patterns, as well as the merged clusters. Compared with Incremental DBSCAN, the proposed algorithm achieves considerable savings in the number of region queries performed. Results are presented to support this claim.
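The abstract does not spell out the merge step, so the following is only a minimal sketch of the general idea (cluster the new batch with DBSCAN, then join new clusters to existing ones), not the authors' implementation. Several things here are assumptions made for illustration: the function names (`region_query`, `dbscan`, `merge_clusters`), the brute-force neighbour search in place of an R*-tree, and the merge rule itself, which joins two clusters whenever a core point of a new cluster lies within `eps` of a core point of an old cluster. This simplified rule does not re-examine old noise or border points whose density changes once the new points arrive, so it is not guaranteed to reproduce the result of Incremental DBSCAN in every case.

```python
from math import dist

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (brute force, no R*-tree)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns (labels, core_flags); noise is labelled -1."""
    labels = [None] * len(points)
    core = [False] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE
            continue
        core[i] = True
        labels[i] = cluster
        seeds = list(neighbours)
        while seeds:
            j = seeds.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                core[j] = True
                seeds.extend(j_neighbours)
        cluster += 1
    return labels, core

def merge_clusters(old_pts, old_labels, old_core,
                   new_pts, new_labels, new_core, eps):
    """Simplified merge (an assumption, see lead-in): union a new cluster with
    an old one when two of their core points lie within eps of each other.
    Returns labels for old_pts followed by new_pts."""
    n_old = max(old_labels, default=NOISE) + 1
    # Shift new cluster ids so they cannot collide with old ids.
    shifted = [l if l == NOISE else l + n_old for l in new_labels]
    parent = list(range(n_old + max(new_labels, default=NOISE) + 1))

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for i, p in enumerate(new_pts):
        if not new_core[i]:
            continue
        for j, q in enumerate(old_pts):
            if old_core[j] and dist(p, q) <= eps:
                parent[find(shifted[i])] = find(old_labels[j])
    return [l if l == NOISE else find(l) for l in old_labels + shifted]

# Hypothetical data: two existing clusters, then a batch that touches the first.
old_pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
new_pts = [(2, 0), (2, 1), (3, 0), (3, 1), (50, 50)]
eps, min_pts = 1.5, 3

old_labels, old_core = dbscan(old_pts, eps, min_pts)
new_labels, new_core = dbscan(new_pts, eps, min_pts)  # clusters of the new data alone
merged = merge_clusters(old_pts, old_labels, old_core,
                        new_pts, new_labels, new_core, eps)
```

Because the new batch is clustered on its own before merging, `new_labels` shows the clustering pattern of the new data alone, while `merged` shows the combined result, which mirrors the advantage the abstract describes.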
Pages: 556-560
Number of pages: 5