A Machine Learning-Based Framework with Enhanced Feature Selection and Resampling for Improved Intrusion Detection

Cited by: 0

Authors:

Malik, Fazila [1]

Khan, Qazi Waqas [2]

Rizwan, Atif [2]

Alnashwan, Rana [3]

Atteia, Ghada [3]

Affiliations:

[1] Iqra Univ Islamabad, Dept Comp Sci, Islamabad 44000, Pakistan

[2] Jeju Natl Univ, Dept Comp Engn, Jejusi 63243, South Korea

[3] Princess Nourah bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Informat Technol, POB 84428, Riyadh 11671, Saudi Arabia

Source:

MATHEMATICS | 2024, Vol. 12, Issue 12

Keywords:

feature selection; data resampling; intrusion detection; applied machine learning; deep learning; INTERNET;

DOI:

10.3390/math12121799

Chinese Library Classification (CLC) number:

O1 [Mathematics];

Discipline classification code:

0701 ; 070101 ;

Abstract:

Intrusion Detection Systems (IDSs) play a crucial role in safeguarding network infrastructures from cyber threats and ensuring the integrity of highly sensitive data. Conventional IDS technologies, although successful in achieving high levels of accuracy, frequently encounter substantial model bias. This bias is primarily caused by imbalances in the data and the lack of relevance of certain features. This study aims to tackle these challenges by proposing an advanced machine learning (ML) based IDS that minimizes misclassification errors and corrects model bias. As a result, the predictive accuracy and generalizability of the IDS are significantly improved. The proposed system employs advanced feature selection techniques, such as Recursive Feature Elimination (RFE), sequential feature selection (SFS), and statistical feature selection, to refine the input feature set and minimize the impact of non-predictive attributes. In addition, this work incorporates data resampling methods such as Synthetic Minority Oversampling Technique and Edited Nearest Neighbor (SMOTE_ENN), Adaptive Synthetic Sampling (ADASYN), and Synthetic Minority Oversampling Technique-Tomek Links (SMOTE_Tomek) to address class imbalance and improve the accuracy of the model. The experimental results indicate that our proposed model, especially when utilizing the random forest (RF) algorithm, surpasses existing models regarding accuracy, precision, recall, and F-score across different data resampling methods. Using the ADASYN resampling method, the RF model achieves an accuracy of 99.9985% for botnet attacks and 99.9777% for Man-in-the-Middle (MITM) attacks, demonstrating the effectiveness of our approach in dealing with imbalanced data distributions. This research not only improves the ability of IDSs to identify botnet and MITM attacks but also provides a scalable and efficient solution that can be used in other areas where data imbalance is a recurring problem. This work has implications beyond IDS, offering valuable insights into using ML techniques in complex real-world scenarios.
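
As a rough illustration of the oversampling idea behind the SMOTE-based methods named in the abstract, the sketch below implements only the basic SMOTE interpolation step in plain Python. It is not the authors' implementation: it omits the ENN and Tomek-link cleaning stages and does not reproduce ADASYN's density weighting. The function name `smote_oversample`, its parameters, and the toy data are all assumptions made for this example.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Hypothetical helper for illustration: for each synthetic point, pick a
    minority sample, pick one of its k nearest minority neighbours, and draw
    a point on the line segment joining the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, nearest to base first.
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy, made-up data: 8 "benign" rows versus 3 "attack" rows.
majority = [(float(x), float(x % 3)) for x in range(8)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.5)]

# Generate enough synthetic attack rows to match the majority class size.
new_points = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
```

In a leakage-safe setup, resampling of this kind is applied to the training split only, so that the test data keeps its original class distribution.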

Pages: 25

50 items in total

[41] MANET: A SURVEY ON MACHINE LEARNING-BASED INTRUSION DETECTION APPROACHES
Laqtib, Safaa
El Yassini, Khalid
Hasnaoui, Moulay Lahcen
INTERNATIONAL JOURNAL OF FUTURE GENERATION COMMUNICATION AND NETWORKING, 2019, 12 (02): : 55 - 70
[42] Machine Learning-Based Intrusion Detection System For Healthcare Data
Balyan, Amit Kumar
Ahuja, Sachin
Sharma, Sanjeev Kumar
Lilhore, Umesh Kumar
PROCEEDINGS OF 3RD IEEE CONFERENCE ON VLSI DEVICE, CIRCUIT AND SYSTEM (IEEE VLSI DCS 2022), 2022, : 290 - 294
[43] A Survey of Machine Learning-based IoT Intrusion Detection Techniques
Long, Jing
Fang, Fei
Luo, Haibo
2021 IEEE 6TH INTERNATIONAL CONFERENCE ON SMART CLOUD (SMARTCLOUD 2021), 2021, : 7 - 12
[44] Improved Crow Search-Based Feature Selection and Ensemble Learning for IoT Intrusion Detection
Jayalatchumy, D.
Ramalingam, Rajakumar
Balakrishnan, Aravind
Safran, Mejdl
Alfarhood, Sultan
IEEE ACCESS, 2024, 12 : 33218 - 33235
[45] Machine learning-based intrusion detection for SCADA systems in healthcare
Tolgahan Öztürk
Zeynep Turgut
Gökçe Akgün
Cemal Köse
Network Modeling Analysis in Health Informatics and Bioinformatics, 2022, 11
[46] Research on Feature Selection of Intrusion Detection Based on Deep Learning
Xin, Mingyuan
Wang, Yong
2020 16TH INTERNATIONAL WIRELESS COMMUNICATIONS & MOBILE COMPUTING CONFERENCE, IWCMC, 2020, : 1431 - 1434
[47] Siamese Network Based Feature Learning for Improved Intrusion Detection
Jmila, Houda
Ibn Khedher, Mohamed
Blanc, Gregory
El Yacoubi, Mounim A.
NEURAL INFORMATION PROCESSING (ICONIP 2019), PT I, 2019, 11953 : 377 - 389
[48] MAFSIDS: a reinforcement learning-based intrusion detection model for multi-agent feature selection networks
Ren, Kezhou
Zeng, Yifan
Zhong, Yuanfu
Sheng, Biao
Zhang, Yingchao
JOURNAL OF BIG DATA, 2023, 10 (01)
[49] A novel framework approach for intrusion detection based on improved critical feature selection in Internet of Things networks
Siddharthan, Hariprasad
Thangavel, Deepa
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (01):
[50] MAFSIDS: a reinforcement learning-based intrusion detection model for multi-agent feature selection networks
Kezhou Ren
Yifan Zeng
Yuanfu Zhong
Biao Sheng
Yingchao Zhang
Journal of Big Data, 10
