An Evolving Fuzzy Model to Determine an Optimal Number of Data Stream Clusters

Cited by: 1
Authors
Al-Khamees, Hussein A. A. [1]
Al-A'araji, Nabeel [1]
Al-Shamery, Eman S. [1]
Affiliations
[1] Babylon Univ, Dept Software, Babylon, Iraq
Keywords
Data stream clustering; Clusters number; Evolving mechanisms
DOI
10.5391/IJFIS.2022.22.3.267
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Data streams are a modern type of data that differ from traditional data in various characteristics: their indefinite size, high access, and concept drift due to their origin in non-stationary environments. Data stream clustering aims to split these data samples into significant clusters, depending on their similarity. The main drawback of data stream clustering algorithms is the large number of clusters they produce. Therefore, determining an optimal number of clusters is an important challenge for these algorithms. In practice, evolving models can change their general structure by implementing different mechanisms. This paper presents a fuzzy model that mainly consists of an evolving Cauchy clustering algorithm which is updated through a specific membership function and determines the optimal number of clusters by implementing two evolving mechanisms: adding and splitting clusters. The proposed model was tested on six different streaming datasets, namely, power supply, sensor, HuGaDB, UCI-HAR, Luxembourg, and keystrokes. The results demonstrated that the efficiency of the proposed model in producing an optimal number of clusters for each dataset outperforms that of previous models.
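The abstract names a Cauchy-type membership function and an "adding" mechanism but gives no formulas. The following is only a minimal sketch of that general idea, not the authors' algorithm: the membership form 1 / (1 + d^2 / spread^2), the running-mean center update, and the names `cauchy_membership`, `cluster_stream`, `spread`, and `add_threshold` are all assumptions made for illustration. The splitting mechanism is not sketched here.

```python
def cauchy_membership(x, center, spread=1.0):
    """Cauchy-type membership in (0, 1]; equals 1 when x is at the center.

    Assumed form: 1 / (1 + d^2 / spread^2), with d the Euclidean distance.
    """
    d2 = sum((a - b) ** 2 for a, b in zip(x, center))
    return 1.0 / (1.0 + d2 / spread ** 2)


def cluster_stream(stream, spread=1.0, add_threshold=0.5):
    """Process samples one at a time, adding clusters when needed.

    A new cluster is added when no existing cluster gives the sample a
    membership of at least add_threshold (hypothetical rule); otherwise
    the best-matching center is moved toward the sample by a running mean.
    """
    centers, counts = [], []
    for x in stream:
        memberships = [cauchy_membership(x, c, spread) for c in centers]
        if not centers or max(memberships) < add_threshold:
            # "Adding" mechanism: the sample starts a new cluster.
            centers.append(list(x))
            counts.append(1)
        else:
            # Update the cluster with the highest membership.
            k = memberships.index(max(memberships))
            counts[k] += 1
            n = counts[k]
            centers[k] = [c + (a - c) / n for c, a in zip(centers[k], x)]
    return centers, counts


if __name__ == "__main__":
    samples = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]
    centers, counts = cluster_stream(samples)
    print(len(centers), counts)
```

In this sketch the number of clusters is never fixed in advance; it grows only when a sample is poorly covered by every existing cluster, which is the sense in which such a model "evolves" its structure.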
Pages: 267-275
Page count: 9