Data Stream Classification by Adaptive Semi-supervised Fuzzy Clustering

Cited by: 0
Authors
Castellano, Giovanna [1]
Fanelli, Anna Maria [1]
Affiliations
[1] Univ Bari Aldo Moro, Dept Comp Sci, Bari, Italy
Keywords
Data stream classification; Semi-supervised clustering;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The analysis and classification of data streams have attracted much attention recently because of the growing number of applications that produce streaming data. Most existing work on data stream classification assumes that all data are completely labeled. In many applications, however, labeled data are difficult or expensive to obtain, whereas unlabeled data are relatively easy to collect. Semi-supervised learning algorithms address this problem by using unlabeled samples together with a few labeled ones to build classification models. In [1] we introduced a data stream classification method based on an incremental semi-supervised fuzzy clustering algorithm. The method assumes that data belonging to different classes become available continuously over time and are processed in chunks. Clusters are formed from a chunk via SSFCM (Semi-Supervised FCM) clustering; when the next chunk becomes available, the clustering is run again starting from the cluster prototypes inherited from the previous chunk. The algorithm creates a fixed number of clusters, set equal to the number of classes. In real-world contexts the underlying distribution of the data may change over time, so a fixed number of clusters may not adequately capture the evolving structure of streaming data. To overcome this limitation, in this work we extend the method proposed in [1] by introducing the capability to adapt the number of clusters dynamically. When the cluster quality deteriorates from one data chunk to the next, the number of clusters is either increased (by splitting some clusters) or decreased (by merging some clusters). The cluster quality is evaluated in terms of the reconstruction error [2], which measures the difference between the original data and their "reconstructed" counterpart derived from the clustering outcome (prototypes and membership degrees).
Preliminary experimental results on the benchmark data set KDD-CUP99 show that the proposed adaptive version of the data stream classification method outperforms the previous static version and is more robust in the presence of outliers.
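The abstract does not give the formula for the reconstruction error or the rule that decides when quality has deteriorated. The following is a minimal Python sketch under stated assumptions: memberships come from the standard (unsupervised) FCM formula rather than SSFCM, a point is decoded as the membership-weighted mean of the prototypes (a form commonly used in the fuzzy clustering literature), and the chunk-to-chunk trigger `quality_deteriorated` with its `tolerance` parameter is hypothetical. All function names are illustrative and are not taken from the paper.

```python
from math import dist


def memberships(x, prototypes, m=2.0):
    """Standard FCM membership degrees of point x to each prototype.

    Assumption: plain FCM memberships, not the SSFCM variant.
    """
    d = [dist(x, v) for v in prototypes]
    # A point that coincides with a prototype belongs fully to it.
    if any(di == 0.0 for di in d):
        hits = [1.0 if di == 0.0 else 0.0 for di in d]
        total = sum(hits)
        return [h / total for h in hits]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** exp for dj in d) for di in d]


def reconstruct(u, prototypes, m=2.0):
    """Decode a point as the u^m-weighted mean of the prototypes."""
    w = [ui ** m for ui in u]
    total = sum(w)
    dim = len(prototypes[0])
    return tuple(
        sum(wi * v[j] for wi, v in zip(w, prototypes)) / total
        for j in range(dim)
    )


def reconstruction_error(chunk, prototypes, m=2.0):
    """Mean squared distance between each point and its reconstruction."""
    err = 0.0
    for x in chunk:
        x_hat = reconstruct(memberships(x, prototypes, m), prototypes, m)
        err += dist(x, x_hat) ** 2
    return err / len(chunk)


def quality_deteriorated(prev_error, curr_error, tolerance=0.1):
    """Hypothetical trigger: the error grew by more than `tolerance` (relative)."""
    return curr_error > prev_error * (1.0 + tolerance)


if __name__ == "__main__":
    # Prototypes inherited from a previous chunk (made-up values).
    prototypes = [(0.0, 0.0), (10.0, 10.0)]
    chunk_a = [(0.5, 0.0), (9.5, 10.0), (0.0, 0.5)]
    # chunk_b contains a group far from both prototypes.
    chunk_b = [(0.5, 0.0), (0.0, 10.0), (0.5, 9.5)]
    e_a = reconstruction_error(chunk_a, prototypes)
    e_b = reconstruction_error(chunk_b, prototypes)
    print(e_a, e_b, quality_deteriorated(e_a, e_b))
```

In this sketch a chunk whose points lie far from every inherited prototype is reconstructed poorly, so its error rises; an adaptive method could use such a rise as the signal to split or merge clusters, although the actual criterion used by the authors is not stated in the abstract.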
Pages: 770-771
Number of pages: 2
Related papers
50 in total
  • [1] Incremental adaptive semi-supervised fuzzy clustering for data stream classification
    Casalino, Gabriella
    Castellano, Giovanna
    Mencar, Corrado
    [J]. PROCEEDINGS OF THE 2018 IEEE INTERNATIONAL CONFERENCE ON EVOLVING AND ADAPTIVE INTELLIGENT SYSTEMS (EAIS), 2018,
  • [2] Data Stream Classification by Dynamic Incremental Semi-Supervised Fuzzy Clustering
    Casalino, Gabriella
    Castellano, Giovanna
    Mencar, Corrado
    [J]. INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2019, 28 (08)
  • [3] Classification of Data Streams by Incremental Semi-supervised Fuzzy Clustering
    Castellano, G.
    Fanelli, A. M.
    [J]. FUZZY LOGIC AND SOFT COMPUTING APPLICATIONS, WILF 2016, 2017, 10147 : 185 - 194
  • [4] Text classification with enhanced semi-supervised fuzzy clustering
    Keswani, G
    Hall, LO
    [J]. PROCEEDINGS OF THE 2002 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOL 1 & 2, 2002, : 621 - 626
  • [5] Semi-Supervised Pattern Classification Utilizing Fuzzy Clustering and Nonlinear Mapping of Data
    Du, Weiwei
    Urahama, Kiichi
    [J]. JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2007, 11 (09) : 1159 - 1164
  • [6] Incremental semi-supervised clustering in a data stream with a flock of agents
    Bruneau, Pierrick
    Picarougne, Fabien
    Gelgon, Marc
    [J]. 2009 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION, VOLS 1-5, 2009, : 3067 - 3074
  • [7] A genetic semi-supervised fuzzy clustering approach to text classification
    Liu, H
    Huang, ST
    [J]. ADVANCES IN WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2003, 2762 : 173 - 180
  • [8] Semi-Supervised Stream Clustering Using Labeled Data Points
    Treechalong, Kritsana
    Rakthanmanon, Thanawin
    Waiyamai, Kitsana
    [J]. MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION, MLDM 2015, 2015, 9166 : 281 - 295
  • [9] SAND: Semi-Supervised Adaptive Novel Class Detection and Classification over Data Stream
    Haque, Ahsanul
    Khan, Latifur
    Baron, Michael
    [J]. THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 1652 - 1658
  • [10] Active semi-supervised fuzzy clustering
    Grira, Nizar
    Crucianu, Michel
    Boujemaa, Nozha
    [J]. PATTERN RECOGNITION, 2008, 41 (05) : 1834 - 1844