Feature selection for optimizing traffic classification

Cited by: 95
Authors
Zhang, Hongli [1]
Lu, Gang [1]
Qassrawi, Mahmoud T. [1]
Zhang, Yu [1]
Yu, Xiangzhan [1]
Affiliations
[1] Harbin Inst Technol, Sch Comp Sci & Technol, Harbin 150001, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature selection; Traffic classification; Class imbalance; Robust features; IDENTIFICATION;
DOI
10.1016/j.comcom.2012.04.012
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Machine learning (ML) algorithms have been widely applied in recent traffic classification. However, due to the imbalance in the number of traffic flows, ML-based classifiers are prone to misclassifying flows as the traffic type that accounts for the majority of flows on the Internet. To address this problem, a novel feature selection metric named Weighted Symmetrical Uncertainty (WSU) is proposed. We design a hybrid feature selection algorithm named WSU_AUC, which prefilters most of the features using the WSU metric and then uses a wrapper method to select features for a specific classifier using the Area Under the ROC Curve (AUC) metric. Additionally, to overcome the impact of dynamic traffic flows on feature selection, we propose an algorithm named SRSF that Selects the Robust and Stable Features from the results achieved by WSU_AUC. We evaluate our approaches using three classifiers on traces captured from entirely different networks. Experimental results obtained by our algorithms are promising in terms of true positive rate (TPR) and false positive rate (FPR). Moreover, our algorithms can achieve >94% flow accuracy and >80% byte accuracy on average. (c) 2012 Elsevier B.V. All rights reserved.
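The abstract does not define how WSU weights its terms, so the following is only a minimal sketch of the standard, unweighted symmetrical uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), which the metric's name suggests as a starting point. The function names, the toy feature values, and the class labels are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def symmetrical_uncertainty(feature, labels):
    """Standard (unweighted) SU = 2 * I(X; Y) / (H(X) + H(Y)), in [0, 1].

    Assumption: this is the plain SU, not the paper's weighted variant.
    """
    h_x = entropy(feature)
    h_y = entropy(labels)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so nothing is shared
    h_xy = entropy(list(zip(feature, labels)))  # joint entropy H(X, Y)
    mutual_information = h_x + h_y - h_xy
    return 2.0 * mutual_information / (h_x + h_y)


# Hypothetical toy flows: two discretized features and an application label.
port_bucket = ["low", "low", "high", "high", "low", "high"]
noise = ["a", "b", "a", "b", "a", "b"]
app_label = ["web", "web", "p2p", "p2p", "web", "p2p"]

su_informative = symmetrical_uncertainty(port_bucket, app_label)
su_noise = symmetrical_uncertainty(noise, app_label)
```

In a filter stage of this kind, features would be ranked by their score against the class label and the low-scoring ones dropped before any classifier-specific wrapper search. Here `port_bucket` determines the label exactly and scores far higher than `noise`.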
Pages: 1457-1471
Number of pages: 15
Related papers
50 in total
  • [31] A machine learning approach for feature selection traffic classification using security analysis
    Muhammad Shafiq
    Xiangzhan Yu
    Ali Kashif Bashir
    Hassan Nazeer Chaudhry
    Dawei Wang
    [J]. The Journal of Supercomputing, 2018, 74 : 4867 - 4892
  • [32] Novel feature selection and classification of Internet video traffic based on a hierarchical scheme
    Dong, Yu-ning
    Zhao, Jia-jie
    Jin, Jiong
    [J]. COMPUTER NETWORKS, 2017, 119 : 102 - 111
  • [33] Performance evaluation of feature selection and tree-based algorithms for traffic classification
    Aouedi, Ons
    Piamrat, Kandaraj
    Parrein, Benoit
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS (ICC WORKSHOPS), 2021,
  • [34] Internet Traffic Classification based on Min-Max Ensemble Feature Selection
    Huang, Yinxiang
    Li, Yun
    Qiang, Baohua
    [J]. 2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016, : 3485 - 3492
  • [35] Principal Feature Selection Impact for Internet Traffic Classification Using Naive Bayes
    Paramita, Adi Suryaputra
    [J]. PROCEEDINGS OF SECOND INTERNATIONAL CONFERENCE ON ELECTRICAL SYSTEMS, TECHNOLOGY AND INFORMATION 2015 (ICESTI 2015), 2016, 365 : 475 - 480
  • [36] Feature Selection Technique Impact for Internet Traffic Classification Using Naive Bayesian
    Antonio, Tony
    Paramita, Adi Suryaputra
    [J]. JURNAL TEKNOLOGI, 2015, 72 (05):
  • [37] COMP 313-Optimizing drug classification by feature selection: To bind or not to bind that is the question
    Riedesel, Henning
    Knapp, E. W.
    [J]. ABSTRACTS OF PAPERS OF THE AMERICAN CHEMICAL SOCIETY, 2008, 236
  • [38] Optimizing Neural Networks for Academic Performance Classification Using Feature Selection and Resampling Approach
    Supriyadi D.
    Purwanto P.
    Warsito B.
    [J]. Mendel, 2023, 29 (02) : 261 - 272
  • [39] Optimizing Feature Selection and Oversampling Using Metaheuristic Algorithms for Binary Fraud Detection Classification
    Biltawi, Mariam M.
    Qaddoura, Raneem
    Faris, Hossam
    [J]. ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2023, PT I, 2023, 675 : 452 - 462
  • [40] A Feature Selection Method for Large-Scale Network Traffic Classification Based on Spark
    Wang, Yong
    Ke, Wenlong
    Tao, Xiaoling
    [J]. INFORMATION, 2016, 7 (01)