An evolving approach to data streams clustering based on typicality and eccentricity data analytics

Cited by: 37
Authors
Bezerra, Clauber Gomes [1]
Jales Costa, Bruno Sielly [2]
Guedes, Luiz Affonso [3]
Angelov, Plamen Parvanov [4]
Affiliations
[1] Fed Inst Educ Sci & Technol Rio Grande Norte do N, Campus Natal Zona Leste, BR-59015000 Natal, RN, Brazil
[2] Fed Inst Educ Sci & Technol Rio Grande Norte do N, Campus Natal Zona Norte Rua Brusque 2926, BR-59112490 Natal, RN, Brazil
[3] Fed Inst Educ Sci & Technol Rio Grande Norte do N, Dept Comp Engn & Automat, DCA Campus Univ, BR-59078900 Natal, RN, Brazil
[4] Univ Lancaster, Sch Comp & Commun, Data Sci Grp, Lancaster LA1 4WA, England
Keywords
Online clustering; Data stream; Eccentricity; Typicality; Anomaly detection; FAULT-DETECTION; FUZZY; IDENTIFICATION; CLASSIFICATION;
DOI
10.1016/j.ins.2019.12.022
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we propose an algorithm for online clustering of data streams. This algorithm is called AutoCloud and is based on the recently introduced concept of Typicality and Eccentricity Data Analytics, mainly used for anomaly detection tasks. AutoCloud is an evolving, online and recursive technique that does not need training or prior knowledge about the data set. Thus, AutoCloud is fully online, requiring no offline processing. It creates and merges clusters autonomously as new data observations become available. The clusters created by AutoCloud are called data clouds, which are structures without pre-defined shape or boundaries. AutoCloud allows each data sample to belong to multiple data clouds simultaneously using fuzzy concepts. AutoCloud is also able to handle concept drift and concept evolution, which are problems inherent in data streams in general. Since the algorithm is recursive and online, it is suitable for applications that require a real-time response. We validate our proposal with applications to multiple well-known data sets in the literature. (C) 2020 Elsevier Inc. All rights reserved.
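The abstract does not give formulas, so the following is only a minimal sketch of the kind of recursive eccentricity test usually described in the Typicality and Eccentricity Data Analytics literature, not AutoCloud itself. The class name `EccentricityTracker`, the threshold parameter `m = 3`, the update equations, and the handling of zero variance are all assumptions introduced for illustration; cloud creation, merging, and fuzzy membership are not shown.

```python
import numpy as np


class EccentricityTracker:
    """Hypothetical helper: recursive statistics for a single data cloud."""

    def __init__(self, m: float = 3.0):
        self.m = m        # assumed sensitivity, comparable to an "m-sigma" rule
        self.k = 0        # number of samples seen so far
        self.mean = None  # recursive mean
        self.var = 0.0    # recursive (scalar) variance

    def update(self, x):
        """Absorb one sample; return (normalized eccentricity, is_outlier)."""
        x = np.asarray(x, dtype=float)
        self.k += 1
        k = self.k
        if k == 1:
            # First sample: it defines the mean, no spread is known yet.
            self.mean = x.copy()
            return 0.5, False
        # Recursive mean and variance updates (assumed TEDA-style form).
        self.mean = ((k - 1) / k) * self.mean + x / k
        d2 = float(np.sum((x - self.mean) ** 2))
        self.var = ((k - 1) / k) * self.var + d2 / (k - 1)
        # Eccentricity; the distance term is treated as zero when var == 0
        # (all samples identical so far), which is an assumption of this sketch.
        ecc = 1.0 / k + (d2 / (k * self.var) if self.var > 0 else 0.0)
        zeta = ecc / 2.0  # normalized eccentricity
        # Chebyshev-style threshold on the normalized eccentricity.
        is_outlier = zeta > (self.m ** 2 + 1) / (2 * k)
        return zeta, is_outlier


if __name__ == "__main__":
    tracker = EccentricityTracker(m=3.0)
    stream = [[0.0], [1.0]] * 10 + [[100.0]]  # 20 regular samples, 1 far sample
    for sample in stream:
        zeta, flag = tracker.update(sample)
        print(sample, round(zeta, 4), flag)
```

Each update touches only the running count, mean, and variance, so memory and time per sample stay constant, which is the property the abstract relies on when it calls the approach recursive and suitable for real-time use.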
Pages: 13-28
Number of pages: 16
Related papers
50 in total
  • [1] Unsupervised Classification of Data Streams based on Typicality and Eccentricity Data Analytics
    Jales Costa, Bruno Sielly
    Bezerra, Clauber Gomes
    Guedes, Luiz Affonso
    Parvanov Angelov, Plamen
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2016, : 58 - 63
  • [2] Online Fault Detection Based on Typicality and Eccentricity Data Analytics
    Jales Costa, Bruno Sielly
    Bezerra, Clauber Gomes
    Guedes, Luiz Affonso
    Angelov, Plamen Parvanov
    [J]. 2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015,
  • [3] Clustering Based Active Learning for Evolving Data Streams
    Ienco, Dino
    Bifet, Albert
    Zliobaite, Indre
    Pfahringer, Bernhard
    [J]. DISCOVERY SCIENCE, 2013, 8140 : 79 - 93
  • [4] Dynamically Evolving Clustering for Data Streams
    Baruah, Rashmi Dutta
    Angelov, Plamen
    Baruah, Diganta
    [J]. 2014 IEEE CONFERENCE ON EVOLVING AND ADAPTIVE INTELLIGENT SYSTEMS (EAIS), 2014,
  • [5] Improved Data Stream Clustering Method: Incorporating KD-Tree for Typicality and Eccentricity-Based Approach
    Xu, Dayu
    Lu, Jiaming
    Zhang, Xuyao
    Zhang, Hongtao
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2024, 78 (02): : 2557 - 2573
  • [6] SPARSE SUBSPACE CLUSTERING FOR EVOLVING DATA STREAMS
    Sui, Jinping
    Liu, Zhen
    Liu, Li
    Jung, Alexander
    Liu, Tianpeng
    Peng, Bo
    Li, Xiang
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 7455 - 7459
  • [7] Online embedding and clustering of evolving data streams
    Zubaroglu, Alaettin
    Atalay, Volkan
    [J]. STATISTICAL ANALYSIS AND DATA MINING, 2023, 16 (01) : 29 - 44
  • [8] Dynamic Clustering Scheme for Evolving Data Streams Based on Improved STRAP
    Sui, Jinping
    Liu, Zhen
    Jung, Alexander
    Liu, Li
    Li, Xiang
    [J]. IEEE ACCESS, 2018, 6 : 46157 - 46166
  • [9] Optimised Clustering Based Approach for Healthcare Data Analytics
    Bhopale, Amol P.
    Zanwar, Sanskar
    Balpande, Aarya
    Kazi, Jaweria
    [J]. INTERNATIONAL JOURNAL OF NEXT-GENERATION COMPUTING, 2023, 14 (01): : 298 - 305
  • [10] Clustering based approach for incomplete data streams processing
    Najib, Fatma M.
    Ismail, Rasha M.
    Badr, Nagwa L.
    Gharib, Tarek F.
    [J]. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2020, 38 (03) : 3213 - 3227