A Novel Rough Set Based Clustering Approach for Streaming Data

Cited by: 0
Authors
Yogita [1]
Toshniwal, Durga [1]
Affiliations
[1] Indian Inst Technol, Roorkee, Uttar Pradesh, India
Keywords
Clustering; Streaming data; Cluster approximation; Rough set
DOI
10.1007/978-81-322-1602-5_131
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is an important data mining task. Clustering streaming data is challenging because the data cannot be scanned multiple times and new concepts may keep evolving in the data over time. The inherent uncertainty of real-world data streams further magnifies this challenge. Rough set theory is a soft computing technique that can be used to deal with the uncertainty involved in cluster analysis. In this paper, we propose a novel rough set based clustering method for streaming data. It describes a cluster as a pair consisting of a lower approximation and an upper approximation. The lower approximation comprises the data objects that can be assigned with certainty to the respective cluster, whereas the upper approximation contains the elements of the lower approximation along with those data objects whose membership in the various clusters is not crisp. Uncertainty in assigning a data object to a cluster is captured by allowing the upper approximations to overlap. The proposed method generates soft clusters. In view of the challenges of streaming data, the proposed method is incremental and adaptive to evolving concepts. Experimental results on synthetic and real-world data sets show that the proposed approach outperforms the Leader clustering algorithm in terms of classification accuracy. The proposed method generates more natural clusters than k-means clustering and is robust to outliers. The performance of the proposed method is also analyzed in terms of the correctness and accuracy of rough clustering.
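The lower/upper approximation idea in the abstract can be illustrated with a small single-pass sketch. This is not the authors' algorithm, which the abstract does not specify. It is a minimal illustration that assumes a leader-style distance threshold for opening new clusters and a distance-ratio test for deciding when membership is uncertain. The class name `RoughStreamClusterer` and the parameters `new_cluster_dist` and `ratio` are hypothetical.

```python
import math


class RoughStreamClusterer:
    """Illustrative single-pass rough clustering (assumed design, not the paper's)."""

    def __init__(self, new_cluster_dist, ratio=1.3):
        self.new_cluster_dist = new_cluster_dist  # assumed threshold for opening a cluster
        self.ratio = ratio                        # assumed ambiguity ratio (>= 1)
        self.leaders = []  # one fixed representative point per cluster
        self.lower = []    # lower approximation: certain members
        self.upper = []    # upper approximation: certain + uncertain members

    def add(self, point_id, point):
        """Process one streamed object exactly once."""
        dists = [math.dist(point, leader) for leader in self.leaders]

        # No cluster is close enough: treat the object as a new concept.
        if not dists or min(dists) > self.new_cluster_dist:
            self.leaders.append(tuple(point))
            self.lower.append({point_id})
            self.upper.append({point_id})
            return

        nearest = min(dists)
        # Clusters whose leader is almost as close as the nearest one.
        close = [j for j, d in enumerate(dists) if d <= self.ratio * nearest]

        if len(close) == 1:
            # Unambiguous: the object belongs to the cluster with certainty.
            self.lower[close[0]].add(point_id)
            self.upper[close[0]].add(point_id)
        else:
            # Ambiguous: the object joins several upper approximations only.
            for j in close:
                self.upper[j].add(point_id)


model = RoughStreamClusterer(new_cluster_dist=6.0)
stream = [("a", (0, 0)), ("b", (10, 0)), ("c", (1, 0)), ("d", (5, 0)), ("e", (9, 1))]
for point_id, point in stream:
    model.add(point_id, point)

for k, (low, up) in enumerate(zip(model.lower, model.upper)):
    print(k, sorted(low), sorted(up))
```

In this toy stream, object `d` is equidistant from both leaders, so it appears in both upper approximations and in neither lower approximation, which is how overlap expresses uncertainty. Each lower approximation remains a subset of its upper approximation.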
Pages: 1253-1265
Number of pages: 13
Related papers
50 in total
  • [31] A Rough Set Based Rule Induction Approach to Geoscience Data
    Hossain, Touhid Mohammad
    Watada, Junzo
    Hermana, Maman
    Shukri, Siti Rohkmah Bt M.
    Sakai, Hiroshi
    [J]. 2018 INTERNATIONAL CONFERENCE ON UNCONVENTIONAL MODELLING, SIMULATION AND OPTIMIZATION - SOFT COMPUTING AND META HEURISTICS - UMSO, 2018,
  • [32] Rough set based attribute reduction approach in data mining
    Li, K
    Liu, YS
    [J]. 2002 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-4, PROCEEDINGS, 2002, : 60 - 63
  • [33] Rough set approach to incomplete data
    Grzymala-Busse, JW
    [J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING - ICAISC 2004, 2004, 3070 : 50 - 55
  • [34] A Rough Set Approach to Incomplete Data
    Grzymala-Busse, Jerzy W.
    [J]. ROUGH SETS AND KNOWLEDGE TECHNOLOGY, RSKT 2015, 2015, 9436 : 3 - 14
  • [35] Tolerance Rough Set Theory Based Data Summarization for Clustering Large Datasets
    Patra, Bidyut Kr.
    Nandi, Sukumar
    [J]. TRANSACTIONS ON ROUGH SETS XIV, 2011, 6600 : 139 - 158
  • [36] Rough Set based Attribute Clustering for Sample Classification of Gene Expression Data
    Nayak, Rudra Kalyan
    Mishra, Debahuti
    Shaw, Kailash
    Mishra, Sashikala
    [J]. INTERNATIONAL CONFERENCE ON MODELLING OPTIMIZATION AND COMPUTING, 2012, 38 : 1788 - 1792
  • [37] Document Clustering Based on Fuzzy Rough Set
    Zhou Peng
    Li Zhishu
    Cheng Yang
    Huang Zhiguo
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON COMMUNICATION SOFTWARE AND NETWORKS, 2009, : 701 - +
  • [38] Clustering Based on Rough Set Knowledge Discovery
    Shan, Chen
    [J]. FUTURE COMPUTER, COMMUNICATION, CONTROL AND AUTOMATION, 2011, 119 : 561 - 565
  • [39] MMeNR: Neighborhood Rough Set Theory Based Algorithm for Clustering Heterogeneous Data
    Tripathy, B. K.
    Goyal, Akarsh
    Chowdhury, Rahul
    Banu, Sharmila K.
    [J]. PROCEEDINGS OF THE 2017 INTERNATIONAL CONFERENCE ON INVENTIVE COMMUNICATION AND COMPUTATIONAL TECHNOLOGIES (ICICCT), 2017, : 323 - 328
  • [40] Student Management Based on Rough Set and Clustering
    Ren, Xueli
    Dai, Yubiao
    [J]. PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON SENSOR NETWORK AND COMPUTER ENGINEERING, 2016, 68 : 501 - 505