Optimizing Data Stream Representation: An Extensive Survey on Stream Clustering Algorithms

Authors
Matthias Carnein
Heike Trautmann
Affiliation
[1] University of Münster, Information Systems and Statistics
Keywords
Stream clustering; Data streams; Online clustering; Pattern recognition; Decision support; Data representation
Abstract
Analyzing data streams has received considerable attention over the past decades due to the widespread use of sensors, social media and other streaming data sources. A core research area in this field is stream clustering, which aims to recognize patterns in an unordered, infinite and evolving stream of observations. Clustering can be a crucial support in decision making, since it aims for an optimized aggregated representation of a continuous data stream over time and allows patterns to be identified in large and high-dimensional data. A multitude of algorithms and approaches have been developed that are able to find and maintain clusters over time in the challenging streaming scenario. This survey explores, summarizes and categorizes a total of 51 stream clustering algorithms and identifies core research threads over the past decades. In particular, it identifies categories of algorithms based on distance thresholds, density grids and statistical models, as well as algorithms for high-dimensional data. Furthermore, it discusses application scenarios, available software and how to configure stream clustering algorithms. This survey is considerably more extensive and more up to date than comparable studies, and it highlights how concepts are interrelated and have been developed over time.
Pages: 277 - 297
Page count: 20
Related papers
  • [1] Optimizing Data Stream Representation: An Extensive Survey on Stream Clustering Algorithms
    Carnein, Matthias
    Trautmann, Heike
    [J]. BUSINESS & INFORMATION SYSTEMS ENGINEERING, 2019, 61 (03) : 277 - 297
  • [2] Clustering data stream: A survey of algorithms
    Mahdiraji, Alireza
    [J]. INTERNATIONAL JOURNAL OF KNOWLEDGE-BASED AND INTELLIGENT ENGINEERING SYSTEMS, 2009, 13 (02) : 39 - 44
  • [3] Data Stream Clustering: A Survey
    Silva, Jonathan A.
    Faria, Elaine R.
    Barros, Rodrigo C.
    Hruschka, Eduardo R.
    de Carvalho, Andre C. P. L. F.
    Gama, Joao
    [J]. ACM COMPUTING SURVEYS, 2013, 46 (01)
  • [4] An evaluation of data stream clustering algorithms
    Mansalis, Stratos
    Ntoutsi, Eirini
    Pelekis, Nikos
    Theodoridis, Yannis
    [J]. STATISTICAL ANALYSIS AND DATA MINING, 2018, 11 (04) : 167 - 187
  • [5] Research on data stream clustering algorithms
    Shifei Ding
    Fulin Wu
    Jun Qian
    Hongjie Jia
    Fengxiang Jin
    [J]. Artificial Intelligence Review, 2015, 43 : 593 - 600
  • [6] Research on data stream clustering algorithms
    Ding, Shifei
    Wu, Fulin
    Qian, Jun
    Jia, Hongjie
    Jin, Fengxiang
    [J]. ARTIFICIAL INTELLIGENCE REVIEW, 2015, 43 (04) : 593 - 600
  • [7] A survey on data stream clustering and classification
    Hai-Long Nguyen
    Woon, Yew-Kwong
    Ng, Wee-Keong
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2015, 45 (03) : 535 - 569
  • [8] A survey on data stream clustering and classification
    Hai-Long Nguyen
    Yew-Kwong Woon
    Wee-Keong Ng
    [J]. Knowledge and Information Systems, 2015, 45 : 535 - 569
  • [9] A Comparative Study on Data Stream Clustering Algorithms
    Keshvani, Twinkle
    Shukla, Madhu
    [J]. PROCEEDING OF THE INTERNATIONAL CONFERENCE ON COMPUTER NETWORKS, BIG DATA AND IOT (ICCBI-2018), 2020, 31 : 219 - 230
  • [10] A Review of Uncertain Data Stream Clustering Algorithms
    Yang, Yue
    Liu, Zhuo
    Xing, Zhidan
    [J]. 2015 EIGHTH INTERNATIONAL CONFERENCE ON INTERNET COMPUTING FOR SCIENCE AND ENGINEERING (ICICSE), 2015 : 111 - 116