Investigating rarity in web attacks with ensemble learners

Authors
Richard Zuech
John Hancock
Taghi M. Khoshgoftaar
Affiliations
[1] Florida Atlantic University
Source
Journal of Big Data, 2021, 8 (01)
Keywords
Rarity; CSE-CIC-IDS2018; Intrusion detection; Web attacks; Class imbalance; Random undersampling; Big data; Ensemble learners
DOI
Not available
Abstract
Class rarity is a frequent challenge in cybersecurity. Rarity occurs when the positive (attack) class has only a small number of instances for machine learning classifiers to train on, which makes it difficult for the classifiers to learn to discriminate the positive class. To investigate rarity, we examine three individual web attacks in big data from the CSE-CIC-IDS2018 dataset: “Brute Force-Web”, “Brute Force-XSS”, and “SQL Injection”. These three web attacks are also severely imbalanced, so we evaluate whether random undersampling (RUS) treatments can improve classification performance for them. Eight RUS ratios are evaluated: no sampling, 999:1, 99:1, 95:5, 9:1, 3:1, 65:35, and 1:1. To measure classification performance, we obtain Area Under the Receiver Operating Characteristic Curve (AUC) scores for seven classifiers: Random Forest (RF), CatBoost (CB), LightGBM (LGB), XGBoost (XGB), Decision Tree (DT), Naive Bayes (NB), and Logistic Regression (LR). The first four are ensemble learners; the last three are single learners included for comparison. We find that applying random undersampling improves overall classification performance, as measured by AUC, in a statistically significant manner. Ensemble learners achieve the top AUC scores after massive undersampling is applied, but they break down and perform poorly (worse than NB and DT) when no sampling is applied under our unique and harsh experimental conditions of severe class imbalance and rarity.
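As a rough illustration of the two mechanics the abstract relies on, the sketch below shows random undersampling of the negative class to a target ratio and a rank-based AUC computation. It is not the authors' implementation. The function names (`random_undersample`, `auc`), the 0/1 label encoding, and the reading of each ratio as negative:positive are assumptions made for this example, and it uses only the Python standard library rather than the classifiers named in the abstract.

```python
import random

# RUS levels listed in the abstract, read here as (negative, positive) parts.
# None stands for "no sampling". This reading of the ratios is an assumption.
RUS_RATIOS = [None, (999, 1), (99, 1), (95, 5), (9, 1), (3, 1), (65, 35), (1, 1)]


def random_undersample(labels, ratio, seed=0):
    """Return the sorted indices kept after undersampling the negative class.

    Every positive (label 1) instance is kept. Negatives (label 0) are drawn
    without replacement so that negatives:positives is approximately `ratio`.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if ratio is None:
        return sorted(pos + neg)
    neg_part, pos_part = ratio
    # Never request more negatives than the data actually contains.
    target_neg = min(len(neg), round(len(pos) * neg_part / pos_part))
    rng = random.Random(seed)
    return sorted(pos + rng.sample(neg, target_neg))


def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is quadratic in the data size, so it suits small
    illustrations only.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 10 attack instances among 10,000 normal ones.
    labels = [1] * 10 + [0] * 10000
    for ratio in RUS_RATIOS:
        kept = random_undersample(labels, ratio)
        n_pos = sum(labels[i] for i in kept)
        print(ratio, "->", len(kept) - n_pos, "negatives,", n_pos, "positives")
```

In an experiment of the kind described, undersampling would normally be applied to the training data only, with AUC computed on untouched test data; that split is left out here to keep the sketch short.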
Related papers
  • [1] Investigating rarity in web attacks with ensemble learners
    Zuech, Richard
    Hancock, John
    Khoshgoftaar, Taghi M.
    JOURNAL OF BIG DATA, 2021, 8 (01)
  • [2] Detecting web attacks using random undersampling and ensemble learners
    Richard Zuech
    John Hancock
    Taghi M. Khoshgoftaar
    Journal of Big Data, 8
  • [3] Detecting web attacks using random undersampling and ensemble learners
    Zuech, Richard
    Hancock, John
    Khoshgoftaar, Taghi M.
    JOURNAL OF BIG DATA, 2021, 8 (01)
  • [4] Detecting SQL Injection Web Attacks Using Ensemble Learners and Data Sampling
    Zuech, Richard
    Hancock, John
    Khoshgoftaar, Taghi M.
    PROCEEDINGS OF THE 2021 IEEE INTERNATIONAL CONFERENCE ON CYBER SECURITY AND RESILIENCE (IEEE CSR), 2021, : 27 - 34
  • [5] Energy Load Forecasting: Investigating Mid-Term Predictions with Ensemble Learners
    Liapis, Charalampos M.
    Karanikola, Aikaterini
    Kotsiantis, Sotiris
    ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2022, PART I, 2022, 646 : 343 - 355
  • [6] WEBRR: A Forensic System for Replaying and Investigating Web-Based Attacks in The Modern Web
    Allen, Joey
    Yang, Zheng
    Xiao, Feng
    Landen, Matthew
    Perdisci, Roberto
    Lee, Wenke
    PROCEEDINGS OF THE 33RD USENIX SECURITY SYMPOSIUM, SECURITY 2024, 2024, : 1669 - 1686
  • [7] EL-RFHC: Optimized ensemble learners using RFHC for intrusion attacks classification
    Kuppusamy, P.
    Kapadia, Dev
    Manvitha, Edaboina Godha
    Dhahbi, Sami
    Iwendi, C.
    Khan, M. Ijaz
    Mohanty, Sachi Nandan
    Ben Khedher, Nidhal
    AIN SHAMS ENGINEERING JOURNAL, 2024, 15 (07)
  • [8] Mining with Rarity for Web Intelligence
    Gui, Yijie
    Gan, Wensheng
    Chen, Yao
    Wu, Yongdong
    COMPANION PROCEEDINGS OF THE WEB CONFERENCE 2022, WWW 2022 COMPANION, 2022, : 973 - 981
  • [9] Investigating class rarity in big data
    Tawfiq Hasanin
    Taghi M. Khoshgoftaar
    Joffrey L. Leevy
    Richard A. Bauder
    Journal of Big Data, 7
  • [10] Investigating the use of inquiry & web-based activities with inclusive biology learners
    Bodzin, Alec M.
    Waller, Patricia L.
    Santoro, Lana Edwards
    Kale, Darlene
    AMERICAN BIOLOGY TEACHER, 2007, 69 (05): : 273 - 279