A clustering-based approach for classifying data streams using graph matching

Cited by: 0
Authors
Du, Yuxin [1]
He, Mingshu [2]
Wang, Xiaojuan [2]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Elect Engn, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Sch Cyberspace Secur, Beijing 100876, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Coarse-grained clustering; Traffic classification; Graph matching algorithm; Primary features; Encrypted traffic classification; Network; Scheme
DOI
10.1186/s40537-025-01087-9
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
In response to challenges such as data encryption, uneven distribution, and user privacy concerns in network traffic classification, this paper presents a clustering-based approach. The proposed method uses graph matching to categorize data streams in real-time scenarios, with the aim of improving the accuracy and efficiency of network traffic classification, particularly in the face of evolving encryption techniques and privacy-preserving measures. The method relies solely on non-content features to characterize network flows and employs graph matching algorithms to reduce inter-class imbalances, enabling coarse-grained clustering and reliable graph matching. First, an unsupervised clustering framework is designed, which studies the diverse distributions and category similarities of traffic data based on a limited set of features. This unsupervised clustering helps mitigate network disparities by aggregating network sessions into a few clusters with extracted primary features. Next, the correlation between clusters from the same network is used to construct a similarity graph. Finally, a graph matching algorithm is proposed, which combines graph neural networks and graph matching networks to reveal reliable correspondences between different network relationships. This allows clusters in the test network to be associated with clusters in the initial network, so that test clusters can be labeled according to their associated clusters in the training set. Simulation results demonstrate that the proposed method achieves an accuracy rate of 96.8%, which is significantly superior to existing approaches.
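The three stages described in the abstract (cluster, build a per-network similarity graph, match graphs to transfer labels) can be illustrated with a small sketch. This is not the authors' implementation: the paper combines graph neural networks with graph matching networks, whereas the sketch below swaps in a brute-force permutation search that is only feasible for a handful of clusters. It also assumes the clustering step has already produced one centroid per cluster and that both networks have the same number of clusters. All function names, the `1 / (1 + distance)` similarity, and the example centroids and labels are hypothetical.

```python
from itertools import permutations
from math import dist


def similarity_graph(centroids):
    """Build a dense similarity graph over clusters of one network.

    Assumption: edge weight is 1 / (1 + Euclidean distance between centroids).
    """
    n = len(centroids)
    return [
        [1.0 / (1.0 + dist(centroids[i], centroids[j])) for j in range(n)]
        for i in range(n)
    ]


def match_graphs(g_train, g_test):
    """Brute-force graph matching (stand-in for the learned matcher).

    Returns a tuple `perm` where test node i is matched to training node
    perm[i], chosen to minimize the total edge-weight mismatch.
    """
    n = len(g_train)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        cost = sum(
            abs(g_test[i][j] - g_train[perm[i]][perm[j]])
            for i in range(n)
            for j in range(n)
        )
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm


def label_test_clusters(train_centroids, train_labels, test_centroids):
    """Label each test cluster with the label of its matched training cluster."""
    perm = match_graphs(
        similarity_graph(train_centroids), similarity_graph(test_centroids)
    )
    return [train_labels[perm[i]] for i in range(len(test_centroids))]


# Hypothetical example: the test network's clusters are shifted in feature
# space and listed in a different order, but their mutual distances are kept.
train_centroids = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
train_labels = ["web", "chat", "video"]
test_centroids = [(15.0, 15.0), (10.0, 10.0), (11.0, 10.0)]

print(label_test_clusters(train_centroids, train_labels, test_centroids))
# ['video', 'web', 'chat']
```

Because the matching uses only relations between clusters within each network, a constant offset between the two networks' feature values does not affect the result. The search tries every permutation, so its cost grows factorially with the number of clusters.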
Pages: 21