A Novel Rough Set Based Clustering Approach for Streaming Data

Cited by: 0
Authors
Yogita [1]
Toshniwal, Durga [1]
Affiliation
[1] Indian Inst Technol, Roorkee, Uttar Pradesh, India
Keywords
Clustering; Streaming data; Cluster approximation; Rough set
DOI
10.1007/978-81-322-1602-5_131
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is an important data mining task. Clustering streaming data is challenging because the data cannot be scanned multiple times and new concepts may keep evolving over time. The inherent uncertainty of real-world data streams further magnifies the challenge. Rough set theory is a soft computing technique that can be used to deal with the uncertainty involved in cluster analysis. In this paper, we propose a novel rough set based clustering method for streaming data. It describes a cluster as a pair consisting of a lower approximation and an upper approximation. The lower approximation comprises the data objects that can be assigned with certainty to the respective cluster, whereas the upper approximation contains the elements of the lower approximation together with those data objects whose membership in the various clusters is not crisp. Uncertainty in assigning a data object to a cluster is captured by allowing upper approximations to overlap. The proposed method generates soft clusters. In view of the challenges of streaming data, the proposed method is incremental and adaptive to evolving concepts. Experimental results on synthetic and real-world data sets show that the proposed approach outperforms the Leader clustering algorithm in terms of classification accuracy. The proposed method generates more natural clusters than k-means clustering and is robust to outliers. The performance of the proposed method is also analyzed in terms of the correctness and accuracy of rough clustering.
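The lower/upper approximation idea in the abstract can be illustrated with a minimal single-pass sketch. This is not the authors' algorithm, which the abstract does not specify. It is a Leader-style illustration under stated assumptions: the names `RoughCluster` and `assign`, the distance threshold `radius`, the ambiguity threshold `margin`, and the rule of updating a centroid only from lower-approximation members are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
import math


@dataclass
class RoughCluster:
    centroid: list                               # running mean of lower-approximation members
    lower: list = field(default_factory=list)    # ids assigned with certainty
    upper: list = field(default_factory=list)    # lower members plus ambiguous ids


def assign(clusters, point_id, point, radius=3.0, margin=0.5):
    """Process one streamed point exactly once (hypothetical rule, see lead-in)."""
    dists = [math.dist(point, c.centroid) for c in clusters]
    if not dists or min(dists) > radius:
        # No existing cluster is close enough: open a new one, so the
        # model can adapt when a new concept appears in the stream.
        clusters.append(RoughCluster(list(point), [point_id], [point_id]))
        return
    d_min = min(dists)
    # Clusters whose distance is within `margin` of the nearest one.
    near = [c for c, d in zip(clusters, dists) if d - d_min <= margin]
    if len(near) == 1:
        # Unambiguous: the point joins the lower (and hence upper) approximation.
        c = near[0]
        c.lower.append(point_id)
        c.upper.append(point_id)
        n = len(c.lower)
        c.centroid = [m + (x - m) / n for m, x in zip(c.centroid, point)]
    else:
        # Ambiguous: the point joins only the upper approximations,
        # which are therefore allowed to overlap.
        for c in near:
            c.upper.append(point_id)


clusters = []
stream = [(0.0, 0.0), (0.2, 0.0), (4.0, 0.0), (4.2, 0.0), (2.1, 0.0)]
for i, p in enumerate(stream):
    assign(clusters, i, p)
for c in clusters:
    print(c.lower, c.upper)
```

In this toy stream the last point lies midway between the two clusters, so it appears in both upper approximations and in neither lower approximation.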
Pages: 1253-1265
Number of pages: 13