Framework for real-time clustering over sliding windows

Cited by: 3
Authors
Badiozamany, Sobhan [1]
Orsborn, Kjell [1]
Risch, Tore [1]
Affiliations
[1] Uppsala Univ, Box 337, SE-75105 Uppsala, Sweden
Keywords
Sliding windows; Clustering; Framework
DOI
10.1145/2949689.2949696
Chinese Library Classification (CLC) code
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Clustering queries over sliding windows require maintaining cluster memberships that change as windows slide. To address this, the Generic 2-phase Continuous Summarization framework (G2CS) uses a generation-based window maintenance approach in which windows are maintained over different time intervals. It provides algorithm-independent and efficient sliding mechanisms for clustering queries, where the clustering algorithms are defined in terms of queries over cluster data represented as temporal tables. A particular challenge for real-time detection of a large number of rapidly evolving clusters is efficiently supporting smooth re-clustering in real time, i.e., minimizing the sliding time as window size increases and stride decreases. To efficiently support such re-clustering for clustering algorithms that do not support deletion of expired data, e.g., BIRCH, G2CS includes a novel window maintenance mechanism called Sliding Binary Merge (SBM), which maintains several generations of intermediate window instances and does not require decremental cluster maintenance. To improve real-time sliding performance, G2CS uses generation-based multi-dimensional indexing. Extensive performance evaluation on both synthetic and real data shows that G2CS scales substantially better than related approaches.
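The idea of sliding a window without decremental maintenance can be illustrated with a minimal sketch. This is not the paper's SBM algorithm, whose details the abstract does not give. It is a generic merge-only scheme, assuming that each stride is summarized by an additive BIRCH-style clustering feature (count, linear sum, squared sum) over 1-D points. The names `CF`, `binary_merge`, and `MergeOnlyWindow` are hypothetical and chosen only for this illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Iterable, List


@dataclass(frozen=True)
class CF:
    """Additive summary of 1-D points: count, linear sum, squared sum."""
    n: int = 0
    ls: float = 0.0
    ss: float = 0.0

    def merge(self, other: "CF") -> "CF":
        # Summaries combine by addition only; no subtraction is ever needed.
        return CF(self.n + other.n, self.ls + other.ls, self.ss + other.ss)

    @staticmethod
    def of(points: Iterable[float]) -> "CF":
        pts = list(points)
        return CF(len(pts), sum(pts), sum(p * p for p in pts))


def binary_merge(parts: List[CF]) -> CF:
    """Merge summaries pairwise, level by level, like a binary tree."""
    if not parts:
        return CF()
    level = list(parts)
    while len(level) > 1:
        nxt = [level[i].merge(level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # carry an unpaired summary up to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]


class MergeOnlyWindow:
    """Hypothetical window made of per-stride summaries (assumption, not G2CS)."""

    def __init__(self, strides_per_window: int) -> None:
        # A bounded deque discards the oldest stride summary when full.
        self.strides: Deque[CF] = deque(maxlen=strides_per_window)

    def slide(self, new_points: Iterable[float]) -> CF:
        # Expiry drops a whole stride summary; the window summary is then
        # rebuilt by merging the surviving summaries.
        self.strides.append(CF.of(new_points))
        return binary_merge(list(self.strides))


window = MergeOnlyWindow(strides_per_window=2)
first = window.slide([1, 2])
second = window.slide([3])
third = window.slide([4, 5])  # the stride holding [1, 2] has expired
```

Rebuilding from stride summaries costs one merge per retained stride on every slide. Keeping intermediate merged results across slides, as the abstract says SBM does with its generations of window instances, is the kind of refinement that would reduce this cost.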
Pages: 13
Related papers
50 in total
  • [41] SHE: A Generic Framework for Data Stream Mining over Sliding Windows
    Wu, Yuhan
    Fan, Zhuochen
    Shi, Qilong
    Zhang, Yixin
    Yang, Tong
    Chen, Cheng
    Zhong, Zheng
    Li, Junnan
    Shtul, Ariel
    Tu, Yaofeng
    51ST INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING, ICPP 2022, 2022,
  • [42] Sliding Sketches: A Framework using Time Zones for Data Stream Processing in Sliding Windows
    Gou, Xiangyang
    He, Long
    Zhang, Yinda
    Wang, Ke
    Liu, Xilai
    Yang, Tong
    Wang, Yi
    Cui, Bin
    KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, : 1015 - 1025
  • [43] Clustering and feature extraction in a 3D real-time echo management framework
    Auran, PG
    Malvig, KE
    PROCEEDINGS OF THE 1996 SYMPOSIUM ON AUTONOMOUS UNDERWATER VEHICLE TECHNOLOGY, 1996, : 300 - 307
  • [44] Real-Time Sliding Friction Identification and Analysis
    Mears, Laine
    Falcon, Jeannie Sullivan
    Kurfess, Thomas
    IEEE CONTROL SYSTEMS MAGAZINE, 2008, 28 (06): : 20 - 28
  • [45] Real-Time Data ETL Framework for Big Real-Time Data Analysis
    Li, Xiaofang
    Mao, Yingchi
    2015 IEEE INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION, 2015, : 1289 - 1294
  • [46] Real-time Transmission Over Switched Ethernet Using a Contracts Based Framework
    Vila-Carbo, J.
    Tur-Masanet, J.
    Hernandez-Orallo, E.
    2009 IEEE CONFERENCE ON EMERGING TECHNOLOGIES & FACTORY AUTOMATION (ETFA 2009), 2009,
  • [47] ETL Framework for Real-Time Business Intelligence over Medical Imaging Repositories
    Tiago Marques Godinho
    Rui Lebre
    João Rafael Almeida
    Carlos Costa
    Journal of Digital Imaging, 2019, 32 : 870 - 879
  • [48] A Framework to Support Real-Time Applications over IEEE802.15.4 DSME
    Taneja, Mukesh
    2015 IEEE TENTH INTERNATIONAL CONFERENCE ON INTELLIGENT SENSORS, SENSOR NETWORKS AND INFORMATION PROCESSING (ISSNIP), 2015,
  • [49] ETL Framework for Real-Time Business Intelligence over Medical Imaging Repositories
    Godinho, Tiago Marques
    Lebre, Rui
    Almeida, Joao Rafael
    Costa, Carlos
    JOURNAL OF DIGITAL IMAGING, 2019, 32 (05) : 870 - 879
  • [50] Spike clustering and neuron tracking over successive time windows
    Wolf, Michael T.
    Burdick, Joel W.
    2007 3RD INTERNATIONAL IEEE/EMBS CONFERENCE ON NEURAL ENGINEERING, VOLS 1 AND 2, 2007, : 663 - +