Enhancing the DISSFCM Algorithm for Data Stream Classification

Cited by: 3
Authors
Casalino, Gabriella [1, 2]
Castellano, Giovanna [1, 2]
Fanelli, Anna Maria [1]
Mencar, Corrado [1, 2]
Affiliations
[1] Univ Bari Aldo Moro, Comp Sci Dept, Bari, Italy
[2] INdAM Res Grp GNCS, Rome, Italy
Keywords
Data stream classification; Semi-supervised fuzzy clustering; Incremental adaptive clustering
DOI
10.1007/978-3-030-12544-8_9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Analyzing data streams has become a new challenge in meeting the demands of real-time analytics. Conventional mining techniques cope poorly with the challenges that data streams pose, including resource constraints such as memory and running time, and the need to process the data in a single scan. Most existing data stream classification methods require labeled samples, which are more difficult and expensive to obtain than unlabeled ones. Semi-supervised learning algorithms can address this problem by using unlabeled samples together with a few labeled ones to build classification models. Recently we proposed DISSFCM, an algorithm for data stream classification based on incremental semi-supervised fuzzy clustering. To cope with the evolution of data, DISSFCM dynamically adapts the number of clusters by splitting large-scale clusters. While splitting is effective in improving the quality of clusters, repeated application without a counterbalance may produce many small-scale clusters. To solve this problem, in this paper we enhance DISSFCM by introducing a procedure that merges small-scale clusters. Preliminary experimental results on a real-world benchmark dataset show the effectiveness of the method.
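The abstract does not state how small-scale clusters are detected or merged, so the following Python sketch is only an illustration of the general idea, not the procedure used in DISSFCM. It assumes hard point counts per cluster (rather than fuzzy memberships), a hypothetical `min_size` threshold, Euclidean distance between prototypes, and a size-weighted mean as the merged prototype; the function name `merge_small_clusters` is likewise made up for this example.

```python
from math import dist


def merge_small_clusters(centers, sizes, min_size):
    """Merge each cluster smaller than min_size into its nearest neighbor.

    Illustrative assumptions (not taken from the paper):
      centers  -- list of coordinate tuples, one prototype per cluster
      sizes    -- list of positive point counts, one per cluster
      min_size -- hypothetical threshold below which a cluster is "small"
    Returns new (centers, sizes) lists; the inputs are not modified.
    """
    centers = [tuple(c) for c in centers]
    sizes = list(sizes)
    while len(centers) > 1:
        # Look at the smallest cluster; stop once it is large enough.
        i = min(range(len(sizes)), key=sizes.__getitem__)
        if sizes[i] >= min_size:
            break
        # Nearest other prototype by Euclidean distance.
        j = min((k for k in range(len(centers)) if k != i),
                key=lambda k: dist(centers[i], centers[k]))
        total = sizes[i] + sizes[j]
        # Merged prototype: size-weighted mean of the two prototypes.
        merged = tuple((sizes[i] * a + sizes[j] * b) / total
                       for a, b in zip(centers[i], centers[j]))
        centers[j], sizes[j] = merged, total
        # Drop the absorbed cluster.
        del centers[i], sizes[i]
    return centers, sizes


if __name__ == "__main__":
    new_centers, new_sizes = merge_small_clusters(
        [(0.0, 0.0), (10.0, 10.0), (1.0, 1.0)], [8, 10, 2], min_size=5)
    print(new_centers, new_sizes)
```

In this toy run the two-point cluster at (1, 1) is absorbed by its nearest neighbor at (0, 0), leaving two clusters of ten points each. A fuzzy-clustering variant would presumably weight by membership degrees rather than hard counts.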
Pages: 109-122
Page count: 14