Equi-Clustream: a framework for clustering time evolving mixed data

Times cited: 6
Authors
Sangam, Ravi Sankar [1]
Om, Hari [2]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Tadepalligudem 534101, Andhra Pradesh, India
[2] Indian Inst Technol, Dept Comp Sci & Engn, Indian Sch Mines, Dhanbad 826004, Jharkhand, India
Keywords
Clustering; Data streams; Time-evolving data; Data mining; Algorithm
DOI
10.1007/s11634-018-0316-3
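Chinese Library Classification (CLC) codes
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714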
Abstract
In a data stream environment, most conventional clustering algorithms are not efficient enough, because large volumes of data arrive continuously and the data points evolve over time. Clustering time-evolving metric data and clustering time-evolving categorical data have each been well explored in recent years, but clustering mixed-type time-evolving data remains challenging because of the structural gap between metric and categorical attributes. In this paper, we devise a generalized framework, termed Equi-Clustream, to dynamically cluster mixed-type time-evolving data. It comprises three algorithms: a Hybrid Drifting Concept Detection Algorithm that detects a drifting concept between the current sliding window and the previous sliding window; a Hybrid Data Labeling Algorithm that assigns an appropriate cluster label to each data vector of the current non-drifting window based on the clustering result of the previous sliding window; and a visualization algorithm that analyses the relationship between clusters at different timestamps and visualizes their evolving trends. The efficacy of the proposed framework is shown by experiments on synthetic and real-world datasets.
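The abstract names the drift-detection and labeling steps but this record does not give their details. The following is only a minimal sketch of the general sliding-window idea, not the authors' algorithms: it assumes a Gower-style mixed distance, cluster summaries made of numeric means and categorical modes, and two hypothetical parameters (`radius`, `drift_threshold`). All function names are illustrative.

```python
from collections import Counter

def summarize(points, num_idx, cat_idx):
    """Summarize a cluster: mean for numeric attributes, mode for categorical ones."""
    center = {}
    for i in num_idx:
        center[i] = sum(p[i] for p in points) / len(points)
    for i in cat_idx:
        center[i] = Counter(p[i] for p in points).most_common(1)[0][0]
    return center

def mixed_distance(x, center, num_idx, cat_idx, ranges):
    """Gower-style distance (an assumption): range-scaled absolute difference
    for numeric attributes, 0/1 mismatch for categorical attributes."""
    total = 0.0
    for i in num_idx:
        total += abs(x[i] - center[i]) / ranges[i] if ranges[i] else 0.0
    for i in cat_idx:
        total += 0.0 if x[i] == center[i] else 1.0
    return total / (len(num_idx) + len(cat_idx))

def label_window(window, centers, num_idx, cat_idx, ranges, radius):
    """Give each point the label of its nearest previous cluster,
    or None (outlier) if even the nearest one is farther than `radius`."""
    labels = []
    for x in window:
        dists = {k: mixed_distance(x, c, num_idx, cat_idx, ranges)
                 for k, c in centers.items()}
        best = min(dists, key=dists.get)
        labels.append(best if dists[best] <= radius else None)
    return labels

def is_drifting(labels, drift_threshold):
    """Flag drift when the outlier fraction exceeds the (hypothetical) threshold."""
    outliers = sum(1 for label in labels if label is None)
    return outliers / len(labels) > drift_threshold

# Previous window: two clusters over (numeric, categorical) vectors.
num_idx, cat_idx, ranges = [0], [1], {0: 10.0}
centers = {
    0: summarize([(1.0, "a"), (1.2, "a")], num_idx, cat_idx),
    1: summarize([(9.0, "b"), (9.2, "b")], num_idx, cat_idx),
}

stable = label_window([(1.1, "a"), (9.1, "b")], centers,
                      num_idx, cat_idx, ranges, radius=0.3)
shifted = label_window([(5.0, "c"), (5.1, "c"), (1.0, "a")], centers,
                       num_idx, cat_idx, ranges, radius=0.3)
print(stable, is_drifting(stable, 0.5))
print(shifted, is_drifting(shifted, 0.5))
```

In this sketch a window that is flagged as drifting would be re-clustered from scratch, while a non-drifting window simply inherits the previous labels; how the paper handles either case is not stated in this record.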
Pages: 973-995
Page count: 23