A Novel Rough Set Based Clustering Approach for Streaming Data

Cited by: 0
Authors
Yogita [1]
Toshniwal, Durga [1]
Affiliation
[1] Indian Inst Technol, Roorkee, Uttar Pradesh, India
Keywords
Clustering; Streaming data; Cluster approximation; Rough set
DOI
10.1007/978-81-322-1602-5_131
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is an important data mining task. Clustering streaming data is challenging because the data cannot be scanned multiple times and new concepts may keep evolving over time. The inherent uncertainty of real-world data streams further magnifies the challenge. Rough set theory is a soft computing technique that can be used to deal with the uncertainty involved in cluster analysis. In this paper, we propose a novel rough set based clustering method for streaming data. It describes a cluster as a pair consisting of a lower approximation and an upper approximation. The lower approximation comprises the data objects that can be assigned to the respective cluster with certainty, whereas the upper approximation contains the elements of the lower approximation together with those data objects whose membership in the various clusters is not crisp. Uncertainty in assigning a data object to a cluster is captured by allowing the upper approximations to overlap. The proposed method generates soft clusters. In view of the challenges of streaming data, the proposed method is incremental and adapts to evolving concepts. Experimental results on synthetic and real-world data sets show that the proposed approach outperforms the Leader clustering algorithm in terms of classification accuracy. The proposed method generates more natural clusters than k-means clustering and is robust to outliers. The performance of the proposed method is also analyzed in terms of the correctness and accuracy of rough clustering.
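The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea of keeping each cluster as a (lower approximation, upper approximation) pair while reading the stream once. The distance threshold `radius`, the ambiguity factor `ratio`, the Leader-style rule for opening a new cluster, and the running-mean center update are all assumptions made for illustration; they are not taken from the paper and should not be read as the authors' method.

```python
import math


class RoughCluster:
    """One cluster stored as a (lower, upper) approximation pair."""

    def __init__(self, point):
        self.center = list(point)
        self.lower = [tuple(point)]  # objects assigned with certainty
        self.upper = [tuple(point)]  # lower members plus ambiguous objects

    def add_certain(self, point):
        # A certain member belongs to both approximations.
        self.lower.append(tuple(point))
        self.upper.append(tuple(point))
        n = len(self.lower)
        # Assumption: the center is the running mean of the lower approximation.
        self.center = [c + (x - c) / n for c, x in zip(self.center, point)]

    def add_ambiguous(self, point):
        # An ambiguous object only enters the upper approximation.
        self.upper.append(tuple(point))


def rough_stream_cluster(stream, radius=1.0, ratio=1.5):
    """Single-pass rough clustering sketch (hypothetical parameters).

    radius: a point farther than this from every center opens a new cluster.
    ratio:  centers within ratio * (nearest distance) count as equally plausible.
    """
    clusters = []
    for point in stream:
        dists = [math.dist(point, c.center) for c in clusters]
        if not dists or min(dists) > radius:
            # No existing cluster is close enough: treat as a new concept.
            clusters.append(RoughCluster(point))
            continue
        d_min = min(dists)
        near = [c for c, d in zip(clusters, dists)
                if d <= radius and d <= ratio * d_min]
        if len(near) == 1:
            near[0].add_certain(point)
        else:
            # Membership is not crisp: share the point among upper approximations.
            for c in near:
                c.add_ambiguous(point)
    return clusters


# Small demonstration on made-up 2-D points.
demo = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (2.5, 2.5)]
for i, c in enumerate(rough_stream_cluster(demo, radius=4.0, ratio=1.5)):
    print(i, "lower:", len(c.lower), "upper:", len(c.upper))
```

In this sketch a point that lies roughly midway between two centers ends up in both upper approximations and in neither lower approximation, which is how overlap in the upper approximations expresses assignment uncertainty.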
Pages: 1253-1265
Page count: 13