Hierarchical clustering for multiple nominal data streams with evolving behaviour

Cited by: 0
Authors
Jerry W. Sangma
Mekhla Sarkar
Vipin Pal
Amit Agrawal
Affiliations
[1] National Institute of Technology Meghalaya
[2] Chang Gung University
[3] Wells Fargo & Company
Source
Keywords
Data streams; Hierarchical clustering; Concept evolution; Nominal data
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Over the past decade, many attempts have been made at data stream clustering, but most fall under the clustering-by-example approach. A number of applications instead require the clustering-by-variable approach, which clusters multiple data streams rather than the data examples within a single stream. Furthermore, only a few works address clustering multiple data streams, and these apply to numeric data streams only. This research gap motivates the present work, which proposes a hierarchical clustering technique for multiple data streams whose data are nominal. To address concept changes in the data streams, clusters in the hierarchical structure are split and merged. The decision to split or merge is based on an entropy measure that represents the cluster's degree of disparity. The performance of the proposed technique is analysed and compared with the Agglomerative Nesting clustering technique on synthetic data as well as a real-world dataset in terms of the Dunn Index, the Modified Hubert Γ statistic, the Cophenetic Correlation Coefficient, and Purity. The proposed technique outperforms the Agglomerative Nesting clustering technique on concept-evolving data streams. Furthermore, the effect of concept evolution on the clustering structure and average entropy is visualised for detailed analysis and understanding.
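As a rough illustration of how an entropy measure could drive split and merge decisions for clusters of nominal streams, the sketch below may help. It is not the authors' algorithm: the abstract does not give the entropy formula, the windowing scheme, or any thresholds. The function names, the per-position Shannon entropy averaged over a window, and the threshold value 0.5 are all assumptions made for illustration only.

```python
import math
from collections import Counter

# Hypothetical threshold; the abstract does not state how it is chosen.
SPLIT_THRESHOLD = 0.5


def column_entropy(symbols):
    """Shannon entropy (in bits) of the symbols seen at one window position
    across all streams of a cluster."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def cluster_entropy(streams):
    """Average per-position entropy over equal-length, non-empty windows of
    nominal streams. This is an assumed disparity measure, not the paper's."""
    positions = list(zip(*streams))
    return sum(column_entropy(col) for col in positions) / len(positions)


def should_split(streams, threshold=SPLIT_THRESHOLD):
    """Split a cluster when its streams disagree too much (high entropy)."""
    return cluster_entropy(streams) > threshold


def should_merge(cluster_a, cluster_b, threshold=SPLIT_THRESHOLD):
    """Merge two clusters when their union would still be homogeneous."""
    return cluster_entropy(cluster_a + cluster_b) <= threshold


# Two streams that agree at every position versus two that never agree.
homogeneous = [list("aabb"), list("aabb")]
mixed = [list("aabb"), list("bbaa")]
```

Under these assumptions, `should_split(mixed)` is true because the two streams disagree at every position, while `should_split(homogeneous)` is false. A concept change in one stream would raise the entropy of its cluster in the next window, which is the kind of signal a split decision could react to.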
Pages: 1737 - 1761
Page count: 24
Related papers
50 in total
  • [1] Hierarchical clustering for multiple nominal data streams with evolving behaviour
    Sangma, Jerry W.
    Sarkar, Mekhla
    Pal, Vipin
    Agrawal, Amit
    Yogita
    [J]. COMPLEX & INTELLIGENT SYSTEMS, 2022, 8 (02) : 1737 - 1761
  • [2] FHC-NDS: Fuzzy Hierarchical Clustering of Multiple Nominal Data Streams
    Sangma, Jerry W.
    Yogita
    Pal, Vipin
    Kumar, Neeraj
    Kushwaha, Riti
    [J]. IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2023, 31 (03) : 786 - 798
  • [3] Statistical hierarchical clustering algorithm for outlier detection in evolving data streams
    Dalibor Krleža
    Boris Vrdoljak
    Mario Brčić
    [J]. Machine Learning, 2021, 110 : 139 - 184
  • [4] Statistical hierarchical clustering algorithm for outlier detection in evolving data streams
    Krleza, Dalibor
    Vrdoljak, Boris
    Brcic, Mario
    [J]. MACHINE LEARNING, 2021, 110 (01) : 139 - 184
  • [5] Adaptive clustering for multiple evolving streams
    Dai, Bi-Ru
    Huang, Jen-Wei
    Yeh, Mi-Yen
    Chen, Ming-Syan
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2006, 18 (09) : 1166 - 1180
  • [6] Dynamically Evolving Clustering for Data Streams
    Baruah, Rashmi Dutta
    Angelov, Plamen
    Baruah, Diganta
    [J]. 2014 IEEE CONFERENCE ON EVOLVING AND ADAPTIVE INTELLIGENT SYSTEMS (EAIS), 2014,
  • [7] Online embedding and clustering of evolving data streams
    Zubaroglu, Alaettin
    Atalay, Volkan
    [J]. STATISTICAL ANALYSIS AND DATA MINING, 2023, 16 (01) : 29 - 44
  • [8] SPARSE SUBSPACE CLUSTERING FOR EVOLVING DATA STREAMS
    Sui, Jinping
    Liu, Zhen
    Liu, Li
    Jung, Alexander
    Liu, Tianpeng
    Peng, Bo
    Li, Xiang
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 7455 - 7459
  • [9] Clustering Multiple Data Streams
    Balzanella, Antonio
    Lechevallier, Yves
    Verde, Rosanna
    [J]. NEW PERSPECTIVES IN STATISTICAL MODELING AND DATA ANALYSIS, 2011, : 247 - 254
  • [10] Clustering over multiple evolving streams by events and correlations
    Yeh, Mi-Yen
    Dai, Bi-Ru
    Chen, Ming-Syan
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2007, 19 (10) : 1349 - 1362