Impact of Minority Class Variability on Anomaly Detection by Means of Random Forests and Support Vector Machines

Authors
Saleem Alraddadi, Faisal [1]
Lago-Fernandez, Luis F. [1]
Rodriguez, Francisco B. [1]
Affiliations
[1] Univ Autonoma Madrid, Escuela Politecn Super, Dept Ingn Informat, Grp Neurocomp Biol, Madrid 28049, Spain
Keywords
Security and privacy; UNSW-NB15 dataset; CICIDS2017; Severe imbalance; Cyberattacks; Detection rate; False alarm rate; Classification
DOI
10.1007/978-3-030-85099-9_34
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The increased connectivity of our world has resulted in a drastic rise in cyberattacks. This has created a dire need for improved security methods that can protect data. Many techniques and technologies have been developed to meet security and privacy demands. Machine learning algorithms are one such technique that can be used to detect cyberattacks. In a real network, attacks represent only a small fraction of the traffic and can therefore be treated as anomalies. This article discusses how the anomaly ratio affects metrics such as accuracy, recall (the true positive rate), and the false positive rate when machine learning algorithms are used to detect cyberattacks. Two algorithms, Random Forests (RFs) and Support Vector Machines (SVMs), and two datasets, UNSW-NB15 and CICIDS-2017, are used to carry out this study. We observe that class imbalance affects each algorithm in a very different way. While SVMs fail to recognize the anomalies with acceptable accuracy, RFs seem to be more robust against class imbalance, although when anomalies are extremely rare their detection begins to deteriorate in a similar way. It is therefore necessary to investigate new methodologies that solve the problem of detecting attacks when their proportion is very small, and even when this proportion changes dynamically over time.
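The quantities named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `subsample_to_ratio` and `detection_metrics`, the 0/1 label encoding (0 = normal, 1 = attack), and the toy data are assumptions made only for illustration. It shows one way to reduce the attack class to a target anomaly ratio and to compute accuracy, detection rate (recall, i.e. true positive rate), and false alarm rate (false positive rate) from predictions.

```python
import random


def subsample_to_ratio(labels, anomaly_ratio, seed=0):
    """Keep every normal sample (label 0) and only as many attack samples
    (label 1) as needed for attacks to form `anomaly_ratio` of the result.
    Returns the sorted indices of the kept samples. Hypothetical helper."""
    if not 0.0 < anomaly_ratio < 1.0:
        raise ValueError("anomaly_ratio must be in (0, 1)")
    normal = [i for i, y in enumerate(labels) if y == 0]
    attack = [i for i, y in enumerate(labels) if y == 1]
    # n_attack / (n_normal + n_attack) = r  =>  n_attack = r * n_normal / (1 - r)
    n_attack = round(anomaly_ratio * len(normal) / (1.0 - anomaly_ratio))
    # Keep at least one attack if available, never more than exist.
    n_attack = min(max(1, n_attack), len(attack))
    rng = random.Random(seed)
    return sorted(normal + rng.sample(attack, n_attack))


def detection_metrics(y_true, y_pred):
    """Accuracy, detection rate (TPR/recall) and false alarm rate (FPR),
    treating label 1 (attack) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "detection_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        "false_alarm_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


if __name__ == "__main__":
    # Toy data (assumed): 900 normal and 100 attack samples, reduced to ~1% attacks.
    labels = [0] * 900 + [1] * 100
    kept = subsample_to_ratio(labels, anomaly_ratio=0.01)
    y_true = [labels[i] for i in kept]
    # A trivial detector that never raises an alarm.
    y_pred = [0] * len(y_true)
    print(detection_metrics(y_true, y_pred))
```

With the trivial "always normal" detector, accuracy stays close to 1 while the detection rate is 0, which is why accuracy alone is a poor indicator under severe imbalance and the detection rate and false alarm rate are reported alongside it.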
Pages: 416-428
Number of pages: 13