Data Stream Classification by Adaptive Semi-supervised Fuzzy Clustering

Cited by: 0
Authors
Castellano, Giovanna [1]
Fanelli, Anna Maria [1]
Affiliations
[1] Univ Bari Aldo Moro, Dept Comp Sci, Bari, Italy
Keywords
Data stream classification; Semi-supervised clustering;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The analysis and classification of data streams have attracted much attention recently because of the growing number of applications that produce streaming data. Most existing work on data stream classification assumes that all data are completely labeled. However, in many applications labeled data are difficult or expensive to obtain, whereas unlabeled data are relatively easy to collect. Semi-supervised learning algorithms can address this problem by using unlabeled samples together with a few labeled ones to build classification models. In [1] we introduced a data stream classification method based on an incremental semi-supervised fuzzy clustering algorithm. The method assumes that data belonging to different classes are continuously available over time and are processed in chunks. Clusters are formed from a chunk via SSFCM (Semi-Supervised FCM) clustering; when the next chunk becomes available, the clustering is run again starting from the cluster prototypes inherited from the previous chunk. The algorithm creates a fixed number of clusters, set equal to the number of classes. In real-world contexts the underlying distribution of the data may change over time, so a fixed number of clusters may not adequately capture the evolving structure of streaming data. To overcome this limitation, in this work we extend the method proposed in [1] with the capability to adapt the number of clusters dynamically. When the cluster quality deteriorates from one data chunk to the next, the number of clusters is increased (by splitting some clusters) or decreased (by merging some clusters). Cluster quality is evaluated in terms of the reconstruction error [2], which measures the difference between the original data and their "reconstructed" counterpart derived from the clustering outcome (prototypes and membership degrees).
Preliminary experimental results on the KDD-CUP99 benchmark data set show that the proposed adaptive version of the data stream classification method outperforms the previous static version and is more robust in the presence of outliers.
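
The abstract does not give the formula for the reconstruction error or the rule that triggers a change in the number of clusters, so the following is only a minimal sketch under stated assumptions. It uses one formulation common in the fuzzy clustering literature, in which each point is rebuilt as a membership-weighted average of the prototypes; whether this matches [2] exactly is an assumption. The memberships come from plain (unsupervised) FCM rather than SSFCM, and the function names and the tolerance `tol` are hypothetical.

```python
import numpy as np


def fcm_memberships(X, V, m=2.0):
    """Plain FCM membership degrees of points X (n, d) to prototypes V (c, d).

    Assumption: the unsupervised FCM update is used here for illustration;
    SSFCM would add a term driven by the labeled samples.
    """
    # Squared Euclidean distances between every point and every prototype.
    d2 = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)  # guard against division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)  # each row sums to 1


def reconstruction_error(X, V, U, m=2.0):
    """Mean squared distance between the data and their reconstruction.

    Each point is rebuilt as sum_i(u_i^m * v_i) / sum_i(u_i^m), i.e. from
    prototypes and membership degrees only (assumed formulation).
    """
    W = U ** m
    X_hat = (W @ V) / W.sum(axis=1, keepdims=True)
    return float(((X - X_hat) ** 2).sum(axis=1).mean())


def quality_deteriorated(err_prev, err_curr, tol=0.1):
    """Hypothetical trigger: flag a chunk whose error grew by more than tol."""
    return err_curr > (1.0 + tol) * err_prev


# Toy chunk with two well-separated groups of points.
chunk = np.array([[0.0, 0.0], [0.0, 0.1], [5.0, 5.0], [5.0, 5.1]])
two_prototypes = np.array([[0.0, 0.05], [5.0, 5.05]])
one_prototype = np.array([[2.5, 2.5]])

err_two = reconstruction_error(chunk, two_prototypes,
                               fcm_memberships(chunk, two_prototypes))
err_one = reconstruction_error(chunk, one_prototype,
                               fcm_memberships(chunk, one_prototype))
```

In this toy setting a single prototype reconstructs the chunk far worse than two prototypes do, which is the kind of signal that could prompt a split; how the method actually chooses which clusters to split or merge is not stated in the abstract and is not modeled here.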
Pages: 770-771
Number of pages: 2