Dealing with Imbalanced Data in Multi-class Network Intrusion Detection Systems Using XGBoost

Cited by: 1
Authors
AL-Essa, Malik [1]
Appice, Annalisa [1, 2]
Affiliations
[1] Univ Bari Aldo Moro, Dipartimento Informat, Via Orabona 4, I-70126 Bari, Italy
[2] Consorzio Interuniv Nazl Informat CINI, Bari, Italy
Keywords
Network intrusion detection; Imbalanced classification; Oversampling; Feature selection; Multi-class classification
DOI
10.1007/978-3-030-93733-1_1
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Network intrusion detection is a crucial cyber-security problem, where machine learning is recognized as a relevant approach to detect signs of malicious activity in the network traffic. However, intrusion detection patterns learned from imbalanced network traffic data often fail to recognize rare attacks. One way to address this issue is to use oversampling before learning, in order to adjust the ratio between the different classes and make the traffic data more balanced. This paper investigates the effect of oversampling coupled with feature selection, in order to understand how the feature relevance may change due to the creation of artificial rare samples. We perform this study using XGBoost for the network traffic classification. The experiments are performed with two benchmark multi-class network intrusion detection problems.
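The idea of oversampling before learning can be illustrated with a minimal sketch. The abstract does not say which oversampling technique the paper uses, so the example below assumes the simplest variant, random duplication of minority-class samples, and should not be read as the authors' method. The function name `random_oversample`, the toy feature tuples, and the class labels are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until every
    class has as many samples as the largest class (assumed strategy)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    # Start from the original data so no real sample is lost.
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Hypothetical toy traffic records: (duration, bytes) with made-up labels.
X = [(0.1, 200), (0.2, 180), (0.1, 220), (0.3, 210), (5.0, 9000), (0.0, 40)]
y = ["normal", "normal", "normal", "normal", "dos", "probe"]

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

After balancing, the rebalanced data would be passed to a classifier and a feature-ranking step; because duplicated or synthetic rare samples shift the class distribution, the ranking computed on the balanced data may differ from the one computed on the original data, which is the effect the abstract sets out to study.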
Pages: 5-21
Page count: 17