Clustering over Evolving Data Streams Based on Online Recent-Biased Approximation

Authors
Fan, Wei [1]
Koyanagi, Yusuke [1]
Asakura, Koichi [2]
Watanabe, Toyohide [1]
Affiliations
[1] Nagoya Univ, Dept Syst & Social Informat, Grad Sch Informat Sci, Chikusa Ku, Furo Cho, Nagoya, Aichi 4648603, Japan
[2] Daido Inst Technol, Sch Informat, Nagoya, Aichi 4578530, Japan
Keywords
Clustering over evolving data streams; time series data; recent-biased approximation; data mining
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
A growing number of real-world applications deal with multiple evolving data streams. This paper proposes a framework for clustering over evolving data streams that takes advantage of recent-biased approximation. In recent-biased approximation, more details are preserved for recent data and fewer coefficients are kept for the whole data stream, which greatly improves clustering efficiency and space usage. The framework consists of two phases: an online phase, which approximates the data streams and maintains summary statistics incrementally, and an offline clustering phase, which can perform dynamic clustering over the data streams on all possible time horizons. As shown by complexity analyses and validated by empirical studies, the framework performs efficiently in the data stream environment while producing clustering results of very high quality.
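The abstract does not say which approximation the authors use, so the following is only a minimal sketch of the general idea of an online, recent-biased summary, not the paper's method. It assumes a simple scheme in which values are kept as (sum, count) buckets and the two oldest buckets of a level are merged into the next, coarser level whenever that level overflows. The class name `RecentBiasedSummary` and the parameter `max_per_level` are hypothetical.

```python
from typing import List, Tuple


class RecentBiasedSummary:
    """Illustrative online summary: fine detail for recent values, coarse for old ones.

    Assumption: this bucket-merging scheme is a stand-in chosen for illustration;
    it is not taken from the paper.
    """

    def __init__(self, max_per_level: int = 2) -> None:
        self.max_per_level = max_per_level
        # levels[i] holds (sum, count) buckets, oldest first; higher i means older data.
        self.levels: List[List[Tuple[float, int]]] = []

    def update(self, x: float) -> None:
        """Insert one new stream value, merging old buckets as needed."""
        carry = (x, 1)
        level = 0
        while carry is not None:
            if level == len(self.levels):
                self.levels.append([])
            self.levels[level].append(carry)
            if len(self.levels[level]) > self.max_per_level:
                # Merge the two oldest buckets and push the result one level up.
                a = self.levels[level].pop(0)
                b = self.levels[level].pop(0)
                carry = (a[0] + b[0], a[1] + b[1])
                level += 1
            else:
                carry = None

    def means(self) -> List[float]:
        """Return bucket means ordered from newest to oldest."""
        out: List[float] = []
        for buckets in self.levels:
            for total, count in reversed(buckets):
                out.append(total / count)
        return out


summary = RecentBiasedSummary()
for value in range(1, 9):
    summary.update(float(value))
print(summary.means())
```

Under this scheme the number of buckets grows roughly logarithmically with the stream length, and an offline step could compare streams by a distance between their bucket means. Both of those choices are assumptions made for this sketch.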
Pages: 12 / +
Page count: 3
Related papers
50 records in total
  • [41] S-RASTER: contraction clustering for evolving data streams
    Ulm, Gregor
    Smith, Simon
    Nilsson, Adrian
    Gustavsson, Emil
    Jirstrand, Mats
    [J]. JOURNAL OF BIG DATA, 2020, 7 (01)
  • [42] Hierarchical clustering for multiple nominal data streams with evolving behaviour
    Jerry W. Sangma
    Mekhla Sarkar
    Vipin Pal
    Amit Agrawal
    [J]. Complex & Intelligent Systems, 2022, 8 : 1737 - 1761
  • [43] Distributed weighted clustering of evolving sensor data streams with noise
    Hassani, Marwan
    Seidl, Thomas
    [J]. Journal of Digital Information Management, 2012, 10 (06): 410 - 420
  • [44] S-RASTER: contraction clustering for evolving data streams
    Gregor Ulm
    Simon Smith
    Adrian Nilsson
    Emil Gustavsson
    Mats Jirstrand
    [J]. Journal of Big Data, 7
  • [45] A fuzzy c means variant for clustering evolving data streams
    Hore, Prodip
    Hall, Lawrence O.
    Goldgof, Dmitry B.
    [J]. 2007 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS, VOLS 1-8, 2007: 802 - 807
  • [46] Clustering over data streams based on grid density and index tree
    Ren J.
    Cai B.
    Hu C.
    [J]. Journal of Convergence Information Technology, 2011, 6 (01) : 83 - 93
  • [47] Statistical σ-partition clustering over data streams
    Park, NH
    Lee, WS
    [J]. KNOWLEDGE DISCOVERY IN DATABASES: PKDD 2003, PROCEEDINGS, 2003, 2838 : 387 - 398
  • [48] Clustering Based on Correlation Fractal Dimension Over an Evolving Data Stream
    Yarlagadda, Anuradha
    Jonnalagedda, Murthy
    Munaga, Krishna
    [J]. INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2018, 15 (01) : 1 - 9
  • [49] Density-Based Clustering over an Evolving Data Stream with Noise
    Cao, Feng
    Ester, Martin
    Qian, Weining
    Zhou, Aoying
    [J]. PROCEEDINGS OF THE SIXTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2006: 328 - +
  • [50] Clustering over an evolving data stream based on grid density and correlation
    Ren, Jiadong
    Cai, Binlei
    Hu, Changzhen
    [J]. ICIC Express Letters, 2010, 4 (05): 1603 - 1609