Tweet Cluster Analyzer: Partition and Join-based Micro-clustering for Twitter Data Stream

Cited by: 1
Authors
Raja, M. Arun Manicka [1]
Swamynathan, S. [1]
Affiliations
[1] Anna Univ, Dept Informat Sci & Technol, Coll Engn Guindy, Madras 600025, Tamil Nadu, India
Keywords
Tweets; Micro-clustering; Macro-clustering; Cosine similarity; DBStream; Partition; Join
DOI
10.1007/978-981-10-3874-7_64
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Data stream mining is the process of extracting knowledge from continuously generated data. Because processing data streams is not a trivial task, the streams have to be analyzed with suitable stream mining techniques. In many high-volume data stream processing applications, stream clustering helps to uncover valuable hidden information. Many works have concentrated on clustering data streams using various methods, but most of these approaches lack some core tasks needed to improve cluster accuracy and to process data streams quickly. To improve cluster quality and reduce the processing time required for cluster generation, a partition-based DBStream clustering method is proposed. The results have been compared with those of various data stream clustering methods, and the experiments show that cluster purity improves by 5% and that the time taken is 10% lower than the average time taken by the other methods for clustering the data streams.
Pages: 671-682
Number of pages: 12
Related papers
11 in total
  • [1] Data Stream Classification Method Combining Micro-Clustering and Active Learning
    Yin, Chunyong
    Chen, Shuangshuang
    [J]. Computer Engineering and Applications, 2023, 59 (20) : 254 - 265
  • [2] The Dynamic Hyper-ellipsoidal Micro-Clustering for Evolving Data Stream Using Only Incoming Datum
    Tangpathompong, Narongrid
    Suksawatchon, Ureerat
    Suksawatchon, Jakkarin
    [J]. IIP'17: PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION PROCESSING, 2017,
  • [3] A data stream subspace clustering algorithm based on region partition
    [J]. Yu, X. (yuxpointfly@gmail.com), 1600, Science Press (51):
  • [4] CC_TRS: Continuous Clustering of Trajectory Stream Data Based on Micro Cluster Life
    Riyadh, Musaab
    Mustapha, Norwati
    Sulaiman, Md. Nasir
    Sharef, Nurfadhlina Binti Mohd
    [J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2017, 2017
  • [5] Clustering spatial data for join operations using match-based partition
    Xiao, Jitian
    [J]. International Conference on Computational Intelligence for Modelling, Control & Automation Jointly with International Conference on Intelligent Agents, Web Technologies & Internet Commerce, Vol 2, Proceedings, 2006, : 471 - 476
  • [6] A Micro-Cluster-Based Data Stream Clustering Method For P2P Traffic Classification
    Yan, Guanghui
    Ai, Minghao
    [J]. INFORMATION TECHNOLOGY APPLICATIONS IN INDUSTRY, PTS 1-4, 2013, 263-266 : 1121 - 1126
  • [7] Efficient Online Stream Clustering Based on Fast Peeling of Boundary Micro-Cluster
    Sun, Jiarui
    Du, Mingjing
    Sun, Chen
    Dong, Yongquan
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
  • [8] CC_TRS: Continuous Clustering of Trajectory Stream Data Based on Micro Cluster Life (vol 2017, 7523138, 2017)
    Riyadh, Musaab
    Mustapha, Norwati
    Sulaiman, Md. Nasir
    Sharef, Nurfadhlina Binti Mohd
    [J]. MATHEMATICAL PROBLEMS IN ENGINEERING, 2018, 2018
  • [9] Dynamic Micro-cluster-Based Streaming Data Clustering Method for Anomaly Detection
    Wang, Xiaolan
    Ahmed, Md Manjur
    Husen, Mohd Nizam
    Tao, Hai
    Zhao, Qian
    [J]. SOFT COMPUTING IN DATA SCIENCE, SCDS 2023, 2023, 1771 : 61 - 75
  • [10] A fast density-based data stream clustering algorithm with cluster centers self-determined for mixed data
    Chen, Jin-Yin
    He, Hui-Hao
    [J]. INFORMATION SCIENCES, 2016, 345 : 271 - 293