A Comprehensive Empirical Analysis of Data Sets, Regression-Based Feature Selectors, and Linear SVM Classifiers for Intrusion Detection Systems

Cited by: 2

Authors:

Azimjonov, Jahongir [1]

Kim, Taehong [2]

Affiliations:

[1] Andijan State Univ, Dept Informat Technol, Andijan 170100, Uzbekistan

[2] Chungbuk Natl Univ, Sch Informat & Commun Engn, Cheongju 28644, South Korea

Source:

IEEE INTERNET OF THINGS JOURNAL | 2024 / Vol. 11 / Issue 21

Funding:

National Research Foundation, Singapore;

Keywords:

Feature extraction; Accuracy; Support vector machines; Intrusion detection; Internet of Things; Classification algorithms; Surveys; Efficient and relevant features; enhancing Internet of Things (IoT) security; intrusion detection system (IDS) data sets; IDSs; linear classifiers [CSVMLK; linear support vector machine (LSVM); stochastic gradient descent classification (SGDC)]; regression-based feature selection

DOI:

10.1109/JIOT.2024.3415499

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Machine learning (ML)-based intrusion detection systems (IDSs) are crucial in safeguarding computer networks against malicious activities. However, building an optimal (accurate and high-performance) ML-based IDS, a combination of data sets, feature selectors, and classifiers, is challenging. This article presents a comprehensive empirical analysis to enhance the effectiveness of IDSs by delving into these three critical components: 1) data sets; 2) feature selection; and 3) classification techniques based on regression models and linear support vector machines (LSVMs), respectively. We begin by evaluating six different data sets commonly used in IDS research, identifying their strengths, limitations, and suitability for real-world scenarios. Next, we explore regression-based feature selectors to identify the most relevant features for intrusion detection, enhancing the accuracy and efficiency of the IDSs. Then, we examine various LSVM classifiers, comparing their performance and highlighting their strengths and weaknesses. By combining these components, this study aims to provide a holistic understanding of the intricate relationship between data sets, regression-based feature selectors, and SVM-based linear classifiers, thus aiding researchers and practitioners in designing more effective and robust IDSs. The empirical analysis conducted in this study employs rigorous evaluation metrics and a comprehensive experimental setup to ensure reliable and unbiased results. The insights gained from our investigation can help guide future research and development efforts toward more efficient and reliable ML-based IDSs.
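The three-stage pipeline the abstract describes (data set, regression-based feature selector, linear SVM classifier) can be illustrated with a minimal, dependency-free sketch. Everything below is an assumption made for illustration and is not the authors' method: the synthetic data stands in for an IDS data set, the selector ranks features by the R^2 of a univariate least-squares fit, the classifier is a linear SVM trained by plain stochastic gradient descent on the hinge loss, and all function names and hyperparameters are hypothetical.

```python
import random


def make_data(n, d, seed):
    """Synthetic two-class data (hypothetical stand-in for IDS records).

    Only features 0 and 1 carry signal; the remaining features are noise.
    """
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.choice([-1, 1])
        row = [rng.gauss(0.0, 1.0) for _ in range(d)]
        row[0] += 2.0 * label
        row[1] += 2.0 * label
        X.append(row)
        y.append(label)
    return X, y


def regression_scores(X, y):
    """Score each feature by the R^2 of a univariate least-squares fit of y on it."""
    n = len(X)
    y_mean = sum(y) / n
    ss_y = sum((v - y_mean) ** 2 for v in y)
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        x_mean = sum(col) / n
        ss_x = sum((v - x_mean) ** 2 for v in col)
        cov = sum((a - x_mean) * (b - y_mean) for a, b in zip(col, y))
        ok = ss_x > 0 and ss_y > 0
        scores.append((cov * cov) / (ss_x * ss_y) if ok else 0.0)
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, in ascending order."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def project(X, cols):
    """Keep only the selected feature columns."""
    return [[row[j] for j in cols] for row in X]


def train_linear_svm(X, y, epochs=20, eta=0.01, lam=0.01, seed=0):
    """Linear SVM via SGD on the L2-regularized hinge loss (constant step size)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            # Gradient step on the regularizer (weight decay).
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                # Hinge loss is active: move toward classifying this sample.
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def accuracy(w, b, X, y):
    """Fraction of samples whose predicted sign matches the label."""
    hits = 0
    for row, label in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, row)) + b
        hits += (1 if score >= 0 else -1) == label
    return hits / len(X)


def run_demo():
    """Select features on training data, then train and evaluate the classifier."""
    X_train, y_train = make_data(400, 10, seed=0)
    X_test, y_test = make_data(200, 10, seed=1)
    selected = select_top_k(regression_scores(X_train, y_train), k=2)
    w, b = train_linear_svm(project(X_train, selected), y_train)
    return selected, accuracy(w, b, project(X_test, selected), y_test)


if __name__ == "__main__":
    print(run_demo())
```

Selecting features before training means the classifier sees only the retained columns, which is what makes such a pipeline lighter at inference time. A library implementation would normally replace both hand-written stages.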

Pages: 34676-34693

Page count: 18
