Equi-Clustream: a framework for clustering time evolving mixed data

Cited by: 6
Authors
Sangam, Ravi Sankar [1 ]
Om, Hari [2 ]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Tadepalligudem 534101, Andhra Pradesh, India
[2] Indian Inst Technol, Dept Comp Sci & Engn, Indian Sch Mines, Dhanbad 826004, Jharkhand, India
Keywords
Clustering; Data streams; Time-evolving data; Data mining; DATA STREAMS; ALGORITHM;
DOI
10.1007/s11634-018-0316-3
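Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code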
020208 ; 070103 ; 0714 ;
Abstract
In a data stream environment, most conventional clustering algorithms are not sufficiently efficient, because large volumes of data arrive continuously and the data points evolve over time. Clustering time-evolving metric data and clustering time-evolving categorical data have each been well explored in recent years, but clustering mixed-type time-evolving data remains challenging because of the structural gap between metric and categorical attributes. In this paper, we devise a generalized framework, termed Equi-Clustream, to dynamically cluster mixed-type time-evolving data. It comprises three algorithms: a Hybrid Drifting Concept Detection Algorithm that detects the drifting concept between the current sliding window and the previous sliding window; a Hybrid Data Labeling Algorithm that assigns an appropriate cluster label to each data vector of the current non-drifting window based on the clustering result of the previous sliding window; and a visualization algorithm that analyses the relationship between clusters at different timestamps and visualizes their evolving trends. The efficacy of the proposed framework is shown by experiments on synthetic and real-world datasets.
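The abstract names the steps but does not give the algorithms, so the following is only an illustrative sketch of the general idea (label the current window against the previous window's clusters, and flag drift when too many vectors fit no cluster). It is not the authors' Hybrid Drifting Concept Detection or Hybrid Data Labeling Algorithm. The distance measure, the function names, and the `max_dist` and `outlier_ratio` thresholds are all assumptions chosen for illustration.

```python
from collections import Counter

# Each data vector is a pair: (list of numeric values, list of categorical values).

def summarize(cluster):
    """Summarize a cluster by numeric means and per-attribute category counts."""
    n = len(cluster)
    means = [sum(col) / n for col in zip(*(row[0] for row in cluster))]
    freqs = [Counter(col) for col in zip(*(row[1] for row in cluster))]
    return {"n": n, "means": means, "freqs": freqs}

def distance(row, summary, ranges):
    """Assumed mixed distance: range-scaled numeric gap plus categorical rarity."""
    num, cat = row
    d_num = sum(abs(x - m) / r for x, m, r in zip(num, summary["means"], ranges))
    # A category never seen in the cluster has count 0, so it contributes 1.
    d_cat = sum(1 - f[v] / summary["n"] for v, f in zip(cat, summary["freqs"]))
    return (d_num + d_cat) / (len(num) + len(cat))

def label_window(window, summaries, ranges, max_dist=0.5):
    """Give each vector the index of its nearest cluster, or None if too far."""
    labels = []
    for row in window:
        dists = [distance(row, s, ranges) for s in summaries]
        best = min(range(len(dists)), key=dists.__getitem__)
        labels.append(best if dists[best] <= max_dist else None)
    return labels

def is_drifting(labels, outlier_ratio=0.3):
    """Flag drift when the share of unlabeled vectors exceeds the threshold."""
    outliers = sum(1 for label in labels if label is None)
    return outliers / len(labels) > outlier_ratio

# Clusters found in the previous sliding window (toy data).
previous = [
    [([1.0], ["a"]), ([2.0], ["a"])],
    [([9.0], ["b"]), ([10.0], ["b"])],
]
summaries = [summarize(c) for c in previous]
ranges = [10.0]  # assumed value range of each numeric attribute

stable = label_window([([1.5], ["a"]), ([9.0], ["b"])], summaries, ranges)
shifted = label_window(
    [([5.0], ["c"]), ([5.5], ["c"]), ([1.0], ["a"])], summaries, ranges
)
print(stable, is_drifting(stable))    # [0, 1] False
print(shifted, is_drifting(shifted))  # [None, None, 0] True
```

In this sketch a non-drifting window simply inherits labels from the previous clustering, while a drifting window would be re-clustered from scratch by whatever mixed-data clustering method is in use.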
Pages: 973 - 995
Number of pages: 23
Related papers
50 in total
  • [1] Equi-Clustream: a framework for clustering time evolving mixed data
    Ravi Sankar Sangam
    Hari Om
    Advances in Data Analysis and Classification, 2018, 12 : 973 - 995
  • [2] A Framework for Clustering Categorical Time-Evolving Data
    Cao, Fuyuan
    Liang, Jiye
    Bai, Liang
    Zhao, Xingwang
    Dang, Chuangyin
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2010, 18 (05) : 872 - 882
  • [3] Deep Embedded Clustering Framework for Mixed Data
    Lee, Yonggu
    Park, Chulwung
    Kang, Shinjin
    IEEE ACCESS, 2023, 11 : 33 - 40
  • [4] Probabilistic clustering of time-evolving distance data
    Julia E. Vogt
    Marius Kloft
    Stefan Stark
    Sudhir S. Raman
    Sandhya Prabhakaran
    Volker Roth
    Gunnar Rätsch
    Machine Learning, 2015, 100 : 635 - 654
  • [5] Probabilistic clustering of time-evolving distance data
    Vogt, Julia E.
    Kloft, Marius
    Stark, Stefan
    Raman, Sudhir S.
    Prabhakaran, Sandhya
    Roth, Volker
    Raetsch, Gunnar
    MACHINE LEARNING, 2015, 100 (2-3) : 635 - 654
  • [6] Clustering of Mixed Data by Integrating Fuzzy, Probabilistic, and Collaborative Clustering Framework
    Pathak, Arkanath
    Pal, Nikhil R.
    INTERNATIONAL JOURNAL OF FUZZY SYSTEMS, 2016, 18 (03) : 339 - 348
  • [7] Clustering of Mixed Data by Integrating Fuzzy, Probabilistic, and Collaborative Clustering Framework
    Arkanath Pathak
    Nikhil R. Pal
    International Journal of Fuzzy Systems, 2016, 18 : 339 - 348
  • [8] An Improved Clustream Clustering Algorithm for Anomaly Detection in Electric Power Big Data
    Wang, Yanming
    Engineering Intelligent Systems, 2022, 30 (03) : 185 - 193
  • [9] A Framework for Outlier Detection in Evolving Data Streams by Weighting Attributes in Clustering
    Yogita
    Toshniwal, Durga
    2ND INTERNATIONAL CONFERENCE ON COMMUNICATION, COMPUTING & SECURITY [ICCCS-2012], 2012, 1 : 214 - 222
  • [10] Time-sensitive clustering evolving textual data streams
    Ammar, Mohamed
    Hidri, Adel
    Sassi Hidri, Minyar
    INTERNATIONAL JOURNAL OF COMPUTER APPLICATIONS IN TECHNOLOGY, 2020, 63 (1-2) : 25 - 40