HCLUWIN: AN ALGORITHM FOR CLUSTERING HETEROGENEOUS DATA STREAMS OVER SLIDING WINDOWS

Cited by: 0
Authors
Ren, Jiadong [1 ,2 ]
Hu, Changzhen [2 ]
Ma, Ruiqing [1 ]
Affiliations
[1] Yanshan Univ, Coll Informat Sci & Engn, Qinhuangdao, Peoples R China
[2] Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China
Keywords
Data stream; Clustering; Heterogeneous attribute; Sliding windows;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Many applications in web usage mining, such as business intelligence and usage characterization, require effective and efficient techniques to discover users with similar usage patterns and web pages with correlated content in the physical world. Clustering click streams can help achieve this goal. Despite their high processing rates, existing methods for clustering click streams over sliding windows ignore the categorical attributes present in click stream data. In this paper, we present HCluWin, an approach for clustering heterogeneous data streams, which contain both continuous and categorical attributes, over sliding windows. A Heterogeneous Temporal Cluster Feature (HTCF) is introduced to monitor the distribution statistics of heterogeneous data points. Based on this structure, the Exponential Histogram of Heterogeneous Cluster Feature (EHHCF) is presented. In addition, a new similarity measure between two heterogeneous objects is proposed. Experimental results show that the clustering quality of HCluWin is higher than that of CluWin and that the stream processing rate of HCluWin is higher than that of HCluStream.
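The abstract does not define HTCF, EHHCF, or the proposed similarity measure, so the following Python sketch is only an illustration of the general idea of summarizing heterogeneous points with a timestamped cluster feature. Every name in it (`HeteroClusterFeature`, `distance`, `expire`, `weight`) is hypothetical, and the distance shown is a generic mix of squared Euclidean distance and a frequency-based categorical mismatch, not the measure proposed in the paper.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class HeteroClusterFeature:
    """Hypothetical summary of points with numeric and categorical attributes.

    This is NOT the paper's HTCF definition; it only illustrates the idea of
    keeping additive statistics plus a timestamp for window expiry.
    """
    n: int = 0
    linear_sum: List[float] = field(default_factory=list)
    square_sum: List[float] = field(default_factory=list)
    cat_counts: List[Counter] = field(default_factory=list)
    latest_ts: float = float("-inf")

    def add(self, numeric: Sequence[float], categorical: Sequence[str], ts: float) -> None:
        """Absorb one point into the summary."""
        if self.n == 0:
            self.linear_sum = [0.0] * len(numeric)
            self.square_sum = [0.0] * len(numeric)
            self.cat_counts = [Counter() for _ in categorical]
        for i, x in enumerate(numeric):
            self.linear_sum[i] += x
            self.square_sum[i] += x * x
        for j, v in enumerate(categorical):
            self.cat_counts[j][v] += 1
        self.n += 1
        self.latest_ts = max(self.latest_ts, ts)

    def centroid(self) -> List[float]:
        """Mean of the numeric attributes."""
        if self.n == 0:
            raise ValueError("empty cluster feature")
        return [s / self.n for s in self.linear_sum]

    def distance(self, numeric: Sequence[float], categorical: Sequence[str],
                 weight: float = 1.0) -> float:
        """Assumed mixed distance: squared Euclidean distance to the centroid
        plus `weight` times the fraction of absorbed points that disagree
        with each categorical value."""
        num = sum((x - c) ** 2 for x, c in zip(numeric, self.centroid()))
        cat = sum(1.0 - self.cat_counts[j][v] / self.n
                  for j, v in enumerate(categorical))
        return num + weight * cat


def expire(features: List[HeteroClusterFeature], now: float,
           window: float) -> List[HeteroClusterFeature]:
    """Keep only summaries whose newest point still lies inside the window."""
    return [f for f in features if f.latest_ts > now - window]


cf = HeteroClusterFeature()
cf.add([1.0, 2.0], ["home"], ts=1)
cf.add([3.0, 4.0], ["home"], ts=2)
same_page = cf.distance([2.0, 3.0], ["home"])
other_page = cf.distance([2.0, 3.0], ["cart"])
```

In this sketch a click at the numeric centroid is at distance 0 when its page category matches every absorbed point, and at distance `weight` when it matches none. Because all statistics are additive, two such summaries could be merged by summing their fields, which is the property that histogram-style window structures generally rely on.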
Pages: 2171-2179
Page count: 9
Related papers
50 in total
  • [1] Clustering Data Streams over Sliding Windows by DCA
    Ta Minh Thuy
    Le Thi Hoai An
    Boudjeloud-Assala, Lydia
    [J]. ADVANCED COMPUTATIONAL METHODS FOR KNOWLEDGE ENGINEERING, 2013, 479 : 65 - 75
  • [2] Clustering Heterogeneous Data Streams with Uncertainty over Sliding Window
    Hentech, Houda
    Gouider, Mohammed Salah
    Farhat, Amine
    [J]. MODEL AND DATA ENGINEERING, MEDI 2013, 2013, 8216 : 162 - 175
  • [3] An EM-Based Algorithm for Clustering Data Streams in Sliding Windows
    Dang, Xuan Hong
    Lee, Vincent
    Ng, Wee Keong
    Ciptadi, Arridhang
    Ong, Kok Leong
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS, PROCEEDINGS, 2009, 5463 : 230 - +
  • [4] Sliding windows over uncertain data streams
    Dallachiesa, Michele
    Jacques-Silva, Gabriela
    Gedik, Bugra
    Wu, Kun-Lung
    Palpanas, Themis
    [J]. KNOWLEDGE AND INFORMATION SYSTEMS, 2015, 45 (01) : 159 - 190
  • [5] Sliding windows over uncertain data streams
    Michele Dallachiesa
    Gabriela Jacques-Silva
    Buğra Gedik
    Kun-Lung Wu
    Themis Palpanas
    [J]. Knowledge and Information Systems, 2015, 45 : 159 - 190
  • [6] Clustering Algorithm for High Dimensional Data Stream over Sliding Windows
    Liu, Weiguo
    OuYang, Jia
    [J]. TRUSTCOM 2011: 2011 INTERNATIONAL JOINT CONFERENCE OF IEEE TRUSTCOM-11/IEEE ICESS-11/FCST-11, 2011, : 1537 - 1542
  • [7] StreamSW: A density-based approach for clustering data streams over sliding windows
    Reddy, K. Shyam Sunder
    Bindu, C. Shoba
    [J]. MEASUREMENT, 2019, 144 : 14 - 19
  • [8] Partition-Based Clustering with Sliding Windows for Data Streams
    Youn, Jonghem
    Choi, Jihun
    Shim, Junho
    Lee, Sang-goo
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2017), PT II, 2017, 10178 : 289 - 303
  • [9] Sketching asynchronous data streams over sliding windows
    Bojian Xu
    Srikanta Tirthapura
    Costas Busch
    [J]. Distributed Computing, 2008, 20 : 359 - 374
  • [10] Dynamic adjustment of sliding windows over data streams
    Zhang, DD
    Li, JZ
    Zhang, ZG
    Wang, WP
    Guo, LJ
    [J]. ADVANCES IN WEB-AGE INFORMATION MANAGEMENT: PROCEEDINGS, 2004, 3129 : 24 - 33