Machine learning-based intrusion detection: feature selection versus feature extraction

Cited by: 3

Authors:

Ngo, Vu-Duc ^{[1,2]}

Vuong, Tuan-Cuong ^{[3]}

Van Luong, Thien ^{[3]}

Tran, Hung ^{[3]}

Affiliations:

[1] MobiFone Corp, Res & Dev Ctr, Hanoi 11312, Vietnam

[2] Hanoi Univ Sci & Technol, Sch Elect & Elect Engn, Hanoi 11657, Vietnam

[3] Phenikaa Univ, Fac Comp Sci, Hanoi 12116, Vietnam

Source:

CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2024, Vol. 27, Issue 03

Keywords:

Intrusion detection; UNSW-NB15; Feature selection; Feature extraction; PCA; Machine learning; Internet of Things; Runtime; Binary; multiclass classification; NIDS; IoT; DETECTION MODEL; BLOCKCHAIN

DOI:

10.1007/s10586-023-04089-5

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

The Internet of Things (IoT) plays an important role in many sectors, such as smart cities, smart agriculture, smart healthcare, and smart manufacturing. However, IoT devices are highly vulnerable to cyber-attacks, which may result in security breaches and data leakage. To prevent these attacks effectively, a variety of machine learning-based network intrusion detection methods for IoT networks have been developed. These methods often rely on either feature extraction or feature selection to reduce the dimension of the input data before it is fed into machine learning models. The aim is to keep detection complexity low enough for real-time operation, which is vital in any intrusion detection system. This paper provides a comprehensive comparison of these two feature reduction methods for intrusion detection in terms of precision, recall, detection accuracy, and runtime complexity, using the modern UNSW-NB15 dataset for both binary and multiclass classification. In general, the feature selection method provides not only better detection performance but also lower training and inference time than its feature extraction counterpart, especially as the number of reduced features K increases. However, the feature extraction method is much more reliable than its selection counterpart when K is very small, such as K = 4. Additionally, feature extraction is less sensitive to changes in K than feature selection, and this holds for both binary and multiclass classification. Based on this comparison, we provide a useful guideline for selecting a suitable intrusion detection type for each specific scenario, as detailed in Table 14 at the end of Sect. 4. Note that such a comparison between feature selection and feature extraction over UNSW-NB15, as well as the theoretical guideline, has been overlooked in the literature.
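
To make the distinction in the abstract concrete, the following is a minimal sketch of the two ways of reducing a feature matrix to K columns. It is an illustration under stated assumptions, not the authors' pipeline: the data is synthetic (not UNSW-NB15), the selection score (absolute correlation with the label) is a stand-in for whatever criterion the paper uses, PCA is computed with a simple power iteration, and the helper names select_top_k and pca_top_k are hypothetical.

```python
import random

def column(X, j):
    return [row[j] for row in X]

def mean(v):
    return sum(v) / len(v)

def abs_corr(a, b):
    """Absolute Pearson correlation between two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return abs(cov) / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def select_top_k(X, y, k):
    """Feature selection: keep k ORIGINAL columns, ranked by |corr| with the label."""
    scores = [abs_corr(column(X, j), y) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[j] for j in keep] for row in X]

def matvec(C, v):
    return [sum(c * x for c, x in zip(row, v)) for row in C]

def pca_top_k(X, k, iters=200, seed=0):
    """Feature extraction: project centered data onto k principal directions,
    found by power iteration with deflation on the sample covariance matrix."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    mu = [mean(column(X, j)) for j in range(d)]
    Xc = [[row[j] - mu[j] for j in range(d)] for row in X]
    C = [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(d)]
         for a in range(d)]
    components = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = matvec(C, v)
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0:
                break
            v = [x / norm for x in w]
        lam = sum(a * b for a, b in zip(v, matvec(C, v)))  # eigenvalue estimate
        components.append(v)
        # Deflate: remove the direction just found before searching for the next.
        C = [[C[a][b] - lam * v[a] * v[b] for b in range(d)] for a in range(d)]
    Z = [[sum(r[j] * c[j] for j in range(d)) for c in components] for r in Xc]
    return components, Z

# Synthetic stand-in data: 200 samples, 6 features; only features 0 and 3
# carry information about the binary label (an assumption for illustration).
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    label = rng.randint(0, 1)
    row = [rng.gauss(0, 1) for _ in range(6)]
    row[0] += 3 * label
    row[3] += 2 * label
    X.append(row)
    y.append(label)

K = 2
keep, X_sel = select_top_k(X, y, K)   # K original columns, still interpretable
components, Z = pca_top_k(X, K)       # K new columns, each mixing all 6 inputs
print("selected columns:", keep)
```

Both outputs have K columns, but selection discards the other columns entirely, whereas each extracted column is a weighted combination of every input feature. Extraction also needs a projection step at inference time, which is one plausible source of the runtime difference the abstract reports.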


Pages: 2365-2379

Page count: 15

50 items in total

[21] Weighted Feature Selection for Machine Learning Based Accurate Intrusion Detection in Communication Networks
Tripathi, Gaurav
Singh, Vishal Krishna
Sharma, Varun
Vinodbhai, Majithia Vivek
[J]. IEEE ACCESS, 2024, 12 : 20973 - 20982
[22] Network Intrusion Detection Through Machine Learning With Efficient Feature Selection
Desai, Rohan
Gopalakrishnan, Venkatesh Tiruchirai
[J]. 2023 15TH INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS & NETWORKS, COMSNETS, 2023,
[23] Review on intrusion detection using feature selection with machine learning techniques
Kalimuthan, C.
Renjit, J. Arokia
[J]. MATERIALS TODAY-PROCEEDINGS, 2020, 33 : 3794 - 3802
[24] Reviewing various feature selection techniques in machine learning-based botnet detection
Baruah, Sangita
Borah, Dhruba Jyoti
Deka, Vaskar
[J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2024, 36 (12):
[25] Towards Effective Feature Selection in Machine Learning-Based Botnet Detection Approaches
Beigi, Elaheh Biglar
Jazi, Hossein Hadian
Stakhanova, Natalia
Ghorbani, Ali A.
[J]. 2014 IEEE CONFERENCE ON COMMUNICATIONS AND NETWORK SECURITY (CNS), 2014, : 247 - 255
[26] Machine Learning-Based Cardiovascular Disease Detection Using Optimal Feature Selection
Ullah, Tahseen
Ullah, Syed Irfan
Ullah, Khalil
Ishaq, Muhammad
Khan, Ahmad
Ghadi, Yazeed Yasin
Algarni, Abdulmohsen
[J]. IEEE ACCESS, 2024, 12 : 16431 - 16446
[27] Feature Selection For Machine Learning-Based Early Detection of Distributed Cyber Attacks
Feng, Yaokai
Akiyama, Hitoshi
Lu, Liang
Sakurai, Kouichi
[J]. 2018 16TH IEEE INT CONF ON DEPENDABLE, AUTONOM AND SECURE COMP, 16TH IEEE INT CONF ON PERVAS INTELLIGENCE AND COMP, 4TH IEEE INT CONF ON BIG DATA INTELLIGENCE AND COMP, 3RD IEEE CYBER SCI AND TECHNOL CONGRESS (DASC/PICOM/DATACOM/CYBERSCITECH), 2018, : 173 - 180
[28] A Comparison of Feature Selection and Feature Extraction in Network Intrusion Detection Systems
Vuong, Tuan-Cuong
Tran, Hung
Trang, Mai Xuan
Ngo, Vu-Duc
Van Luong, Thien
[J]. PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1798 - 1804
[29] A Fusion of Feature Extraction and Feature Selection Technique for Network Intrusion Detection
Hamid, Yasir
Sugumaran, M.
Journaux, Ludovic
[J]. INTERNATIONAL JOURNAL OF SECURITY AND ITS APPLICATIONS, 2016, 10 (08): : 151 - 158
[30] Deep learning based latent feature extraction for intrusion detection
Mighan, Soosan Naderi
Kahani, Mohsen
[J]. 26TH IRANIAN CONFERENCE ON ELECTRICAL ENGINEERING (ICEE 2018), 2018, : 1511 - 1516
