A clustering-based approach for classifying data streams using graph matching

Cited by: 0
Authors
Du, Yuxin [1 ]
He, Mingshu [2 ]
Wang, Xiaojuan [2 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Elect Engn, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Sch Cyberspace Secur, Beijing 100876, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Coarse-grained clustering; Traffic classification; Graph matching algorithm; Primary features; ENCRYPTED TRAFFIC CLASSIFICATION; NETWORK; SCHEME;
DOI
10.1186/s40537-025-01087-9
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In response to challenges such as data encryption, uneven distribution, and user privacy concerns in network traffic classification, this paper presents a clustering-based approach that uses graph matching to categorize data streams in real-time scenarios. The aim is to improve the accuracy and efficiency of network traffic classification, particularly in the face of evolving encryption techniques and privacy-preserving measures. The method relies solely on non-content features to characterize network flows and employs graph matching to reduce inter-class imbalance, enabling coarse-grained clustering and reliable graph matching. First, an unsupervised clustering framework is designed, which studies the diverse distributions and category similarities of traffic data based on a limited set of features. This clustering helps mitigate disparities between networks by aggregating network sessions into a few clusters described by extracted primary features. Next, the correlation between clusters from the same network is used to construct a similarity graph. Finally, a graph matching algorithm is proposed, which combines graph neural networks and graph matching networks to reveal reliable correspondences between the cluster relationships of different networks. This associates clusters in the test network with clusters in the initial (training) network, so that test clusters can be labeled according to their associated training clusters. Simulation results show that the proposed method achieves an accuracy of 96.8%, which the authors report as significantly higher than that of existing approaches.
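The three stages described in the abstract (cluster, build a similarity graph, match graphs to transfer labels) can be illustrated with a minimal sketch. This is not the authors' implementation: plain k-means stands in for their unsupervised clustering framework, an exhaustive permutation search stands in for their graph neural network and graph matching network, and the two-dimensional flow features, the class names ("chat", "web", "video") and all function names are hypothetical.

```python
import itertools
import math
from collections import Counter


def farthest_point_init(points, k):
    """Deterministic seeding: repeatedly pick the point farthest from chosen seeds."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means (a stand-in for the paper's clustering framework)."""
    centroids = farthest_point_init(points, k)
    labels = []
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, labels


def similarity_graph(centroids):
    """Weighted adjacency matrix: closer cluster centroids get a higher weight."""
    k = len(centroids)
    return [
        [math.exp(-math.dist(centroids[i], centroids[j])) for j in range(k)]
        for i in range(k)
    ]


def match_graphs(g_train, g_test):
    """Exhaustive matching (assumption: small k). perm[i] is the test cluster
    paired with training cluster i, chosen to minimise edge-weight mismatch."""
    k = len(g_train)

    def cost(perm):
        return sum(
            abs(g_train[i][j] - g_test[perm[i]][perm[j]])
            for i in range(k)
            for j in range(k)
        )

    return min(itertools.permutations(range(k)), key=cost)


def make_flows(centers, offset=(0.0, 0.0)):
    """Synthetic 2-D flow features: four points around each class centre."""
    spread = [(-0.1, 0.0), (0.1, 0.0), (0.0, 0.1), (0.0, -0.1)]
    return [
        (cx + dx + offset[0], cy + dy + offset[1])
        for cx, cy in centers
        for dx, dy in spread
    ]


# Training network: labelled flows from three hypothetical classes.
train_flows = make_flows([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)])
train_labels = ["chat"] * 4 + ["web"] * 4 + ["video"] * 4

# Test network: same classes, different order, all features shifted.
test_flows = make_flows([(5.0, 0.0), (0.0, 0.0), (1.0, 0.0)], offset=(10.0, 10.0))
test_true = ["video"] * 4 + ["chat"] * 4 + ["web"] * 4

k = 3
train_centroids, train_assign = kmeans(train_flows, k)
test_centroids, test_assign = kmeans(test_flows, k)

# Label each training cluster by the majority label of its members.
train_cluster_label = {
    c: Counter(
        lab for lab, a in zip(train_labels, train_assign) if a == c
    ).most_common(1)[0][0]
    for c in range(k)
}

# Match cluster-relationship graphs, then transfer labels to the test clusters.
perm = match_graphs(similarity_graph(train_centroids), similarity_graph(test_centroids))
test_cluster_label = {perm[i]: train_cluster_label[i] for i in range(k)}
predicted = [test_cluster_label[a] for a in test_assign]
```

Because the matching compares only the relationships between clusters, the constant shift applied to the test features does not affect the result in this toy setting. That is the intuition behind using graph structure to bridge differences between networks. The exhaustive search grows factorially with the number of clusters and is only practical for the handful of coarse clusters assumed here.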
Pages: 21
Related papers
50 in total
  • [41] A lightweight clustering-based approach to discover different emotional shades from social message streams
    Di Martino, Ferdinando
    Senatore, Sabrina
    Sessa, Salvatore
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2019, 34 (07) : 1505 - 1523
  • [42] A novel clustering-based anonymization approach for graph to achieve Privacy Preservation in Social Network
    Jiang, Huowen
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ADVANCES IN MECHANICAL ENGINEERING AND INDUSTRIAL INFORMATICS, 2015, 15 : 545 - 549
  • [43] AEDS-IoT: Adaptive clustering-based Event Detection Scheme for IoT data streams
    Raut, Ashwin
    Shivhare, Anubhav
    Chaurasiya, Vijay Kumar
    Kumar, Manish
    INTERNET OF THINGS, 2023, 22
  • [44] OpenK: An Elastic Data Cleansing System with A Clustering-based Data Anomaly Detection Approach
    Tran Khanh Dang
    Dinh Khuong Nguyen
    Luc Minh Tuan
    2021 15TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND APPLICATIONS (ACOMP 2021), 2021, : 120 - 127
  • [45] A Micro-Cluster based Ensemble Approach for Classifying Distributed Data Streams
    Mao, Guojun
    Yang, Yi
    2011 23RD IEEE INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2011), 2011, : 753 - 759
  • [46] An evolving approach to data streams clustering based on typicality and eccentricity data analytics
    Bezerra, Clauber Gomes
    Jales Costa, Bruno Sielly
    Guedes, Luiz Affonso
    Angelov, Plamen Parvanov
    INFORMATION SCIENCES, 2020, 518 : 13 - 28
  • [47] Building categorization revisited: A clustering-based approach to using smart meter data for building energy benchmarking
    Zhan, Sicheng
    Liu, Zhaoru
    Chong, Adrian
    Yan, Da
    APPLIED ENERGY, 2020, 269
  • [48] A personalized clustering-based approach using open linked data for search space reduction in recommender systems
    da Costa, Arthur F.
    D'Addio, Rafael M.
    Fressato, Eduardo P.
    Manzato, Marcelo G.
    WEBMEDIA 2019: PROCEEDINGS OF THE 25TH BRAZILLIAN SYMPOSIUM ON MULTIMEDIA AND THE WEB, 2019, : 409 - 416
  • [49] Latent Fingerprints Segmentation: Feasibility of Using Clustering-Based Automated Approach
    Arshad, Irfan
    Raja, Gulistan
    Khan, Ahmad Khalil
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2014, 39 (11) : 7933 - 7944
  • [50] Missing value imputation using a fuzzy clustering-based EM approach
    Rahman, Md. Geaur
    Islam, Md Zahidul
    KNOWLEDGE AND INFORMATION SYSTEMS, 2016, 46 (02) : 389 - 422