A Machine Learning-Based Framework with Enhanced Feature Selection and Resampling for Improved Intrusion Detection

Cited by: 0
Authors
Malik, Fazila [1]
Khan, Qazi Waqas [2]
Rizwan, Atif [2]
Alnashwan, Rana [3]
Atteia, Ghada [3]
Affiliations
[1] Iqra Univ Islamabad, Dept Comp Sci, Islamabad 44000, Pakistan
[2] Jeju Natl Univ, Dept Comp Engn, Jejusi 63243, South Korea
[3] Princess Nourah bint Abdulrahman Univ, Coll Comp & Informat Sci, Dept Informat Technol, POB 84428, Riyadh 11671, Saudi Arabia
Keywords
feature selection; data resampling; intrusion detection; applied machine learning; deep learning; INTERNET
DOI
10.3390/math12121799
Chinese Library Classification number
O1 [Mathematics]
Discipline classification codes
0701; 070101
Abstract
Intrusion Detection Systems (IDSs) play a crucial role in safeguarding network infrastructures from cyber threats and ensuring the integrity of highly sensitive data. Conventional IDS technologies, although successful in achieving high levels of accuracy, frequently encounter substantial model bias. This bias is primarily caused by imbalances in the data and the lack of relevance of certain features. This study aims to tackle these challenges by proposing an advanced machine learning (ML) based IDS that minimizes misclassification errors and corrects model bias. As a result, the predictive accuracy and generalizability of the IDS are significantly improved. The proposed system employs advanced feature selection techniques, such as Recursive Feature Elimination (RFE), sequential feature selection (SFS), and statistical feature selection, to refine the input feature set and minimize the impact of non-predictive attributes. In addition, this work incorporates data resampling methods such as Synthetic Minority Oversampling Technique and Edited Nearest Neighbor (SMOTE_ENN), Adaptive Synthetic Sampling (ADASYN), and Synthetic Minority Oversampling Technique-Tomek Links (SMOTE_Tomek) to address class imbalance and improve the accuracy of the model. The experimental results indicate that our proposed model, especially when utilizing the random forest (RF) algorithm, surpasses existing models regarding accuracy, precision, recall, and F Score across different data resampling methods. Using the ADASYN resampling method, the RF model achieves an accuracy of 99.9985% for botnet attacks and 99.9777% for Man-in-the-Middle (MITM) attacks, demonstrating the effectiveness of our approach in dealing with imbalanced data distributions. This research not only improves the abilities of IDS to identify botnet and MITM attacks but also provides a scalable and efficient solution that can be used in other areas where data imbalance is a recurring problem. 
This work has implications beyond IDS, offering valuable insights into using ML techniques in complex real-world scenarios.
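The resampling methods named in the abstract (SMOTE_ENN, SMOTE_Tomek, ADASYN) all build on one idea: create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority-class neighbours. The following is a minimal pure-Python sketch of that interpolation step only, not the authors' implementation. The function name `smote_oversample`, its parameters, and the toy data are assumptions made for illustration; the ENN and Tomek-link cleaning stages and ADASYN's density-based weighting are not shown.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic rows, each interpolated between a randomly
    chosen minority row and one of its k nearest minority-class neighbours.
    Hypothetical helper for illustration only."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (made up): 8 majority-class rows and 3 minority-class rows.
majority = [[float(x), float(x % 3)] for x in range(8)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

# Generate enough synthetic rows to match the majority-class count.
new_rows = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_rows
print(len(majority), len(balanced_minority))
```

Because every synthetic row lies on a segment between two real minority rows, the new points stay inside the region the minority class already occupies. In practice a maintained library implementation would normally be preferred over a hand-written one like this.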
Pages: 25