Machine learning-based intrusion detection: feature selection versus feature extraction

Cited by: 3
Authors
Ngo, Vu-Duc [1, 2]
Vuong, Tuan-Cuong [3]
Van Luong, Thien [3]
Tran, Hung [3]
Affiliations
[1] MobiFone Corp, Res & Dev Ctr, Hanoi 11312, Vietnam
[2] Hanoi Univ Sci & Technol, Sch Elect & Elect Engn, Hanoi 11657, Vietnam
[3] Phenikaa Univ, Fac Comp Sci, Hanoi 12116, Vietnam
Keywords
Intrusion detection; UNSW-NB15; Feature selection; Feature extraction; PCA; Machine learning; Internet of Things; Runtime; Binary; multiclass classification; NIDS; IoT; Detection model; Blockchain
DOI
10.1007/s10586-023-04089-5
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
The Internet of Things (IoT) plays an important role in many sectors, such as smart cities, smart agriculture, smart healthcare, and smart manufacturing. However, IoT devices are highly vulnerable to cyber-attacks, which may result in security breaches and data leaks. To prevent these attacks effectively, a variety of machine learning-based network intrusion detection methods for IoT networks have been developed. These methods often rely on either feature extraction or feature selection to reduce the dimensionality of the input data before it is fed into machine learning models. The goal is to keep detection complexity low enough for real-time operation, which is vital in any intrusion detection system. This paper provides a comprehensive comparison of these two feature reduction methods for intrusion detection in terms of precision, recall, detection accuracy, and runtime complexity, using the modern UNSW-NB15 dataset for both binary and multiclass classification. In general, the feature selection method provides not only better detection performance but also lower training and inference time than its feature extraction counterpart, especially as the number of reduced features K increases. However, the feature extraction method is much more reliable than its selection counterpart, particularly when K is very small, such as K = 4. Additionally, feature extraction is less sensitive to changes in the number of reduced features K than feature selection, and this holds true for both binary and multiclass classification. Based on this comparison, we provide a useful guideline for selecting a suitable intrusion detection type for each specific scenario, as detailed in Table 14 at the end of Sect. 4. Note that such a comparison between feature selection and feature extraction on UNSW-NB15, as well as the resulting theoretical guideline, has been overlooked in the literature.
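The distinction between the two reduction methods compared in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the data are synthetic rather than UNSW-NB15, the selection criterion (absolute Pearson correlation with the label) and the nearest-centroid classifier are assumptions made for illustration, and all function names are hypothetical. Only PCA is taken from the record's keywords. Feature selection keeps K of the original columns, while feature extraction projects all columns onto K new axes.

```python
import numpy as np

def select_top_k(X_train, y_train, k):
    """Feature selection: indices of the k original columns most
    correlated (in absolute value) with the label. Assumed criterion."""
    Xc = X_train - X_train.mean(axis=0)
    yc = y_train - y_train.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    scores = np.abs(Xc.T @ yc) / denom
    return np.sort(np.argsort(scores)[::-1][:k])

def fit_pca(X_train, k):
    """Feature extraction: training mean and a projection matrix onto
    the top-k principal components, computed via SVD."""
    mean = X_train.mean(axis=0)
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt[:k].T  # shapes: (n_features,), (n_features, k)

def nearest_centroid_accuracy(Z_train, y_train, Z_test, y_test):
    """Stand-in classifier: assign each test row to the closest class mean."""
    classes = np.unique(y_train)
    centroids = np.stack([Z_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((Z_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    pred = classes[dists.argmin(axis=1)]
    return float((pred == y_test).mean())

def compare(k, seed=0):
    """Reduce synthetic data to k features both ways and score each."""
    rng = np.random.default_rng(seed)
    n, d = 600, 20
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, d))
    X[:, :3] += 2.0 * y[:, None]  # only the first three columns carry signal
    X_tr, X_te, y_tr, y_te = X[:400], X[400:], y[:400], y[400:]

    # Selection: the reduced features are a subset of the original columns.
    cols = select_top_k(X_tr, y_tr, k)
    acc_sel = nearest_centroid_accuracy(X_tr[:, cols], y_tr, X_te[:, cols], y_te)

    # Extraction: the reduced features are linear combinations of all columns.
    mean, W = fit_pca(X_tr, k)
    acc_ext = nearest_centroid_accuracy((X_tr - mean) @ W, y_tr,
                                        (X_te - mean) @ W, y_te)
    return cols, acc_sel, acc_ext

cols, acc_sel, acc_ext = compare(k=4)
print("selected columns:", cols)
print("accuracy with selection:", acc_sel)
print("accuracy with PCA extraction:", acc_ext)
```

Rerunning `compare` over a range of K values, and wrapping each branch in a timer, would give a toy version of the sensitivity and runtime comparison the abstract describes. Results on this synthetic data say nothing about the paper's findings on UNSW-NB15.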
Pages: 2365-2379
Number of pages: 15