Improving Multiclass Classification of Cybersecurity Breaches in Railway Infrastructure using Imbalanced Learning

Cited by: 0

Authors:

Nebaba, Aleksandr N. [1]

Savvas, Ilias K. [2]

Butakova, Maria A. [3]

Chernov, Andrey V. [3]

Shevchuk, Petr S. [4]

Affiliations:

[1] Rostov State Transport Univ, Rostov Na Donu, Russia

[2] Univ Thessaly, Sch Technol, Dept Digital Syst, Larisa, Greece

[3] Southern Fed Univ, Smart Mat Res Inst, Rostov Na Donu, Russia

[4] Don State Tech Univ, Rostov Na Donu, Russia

Source:

ESSE 2021: The 2nd European Symposium on Software Engineering | 2021

Funding:

Russian Foundation for Basic Research

Keywords:

Multiclass classification; Machine learning; Imbalanced learning; Cybersecurity breaches; Railway infrastructure

DOI:

10.1145/3501774.3501789

Abstract:

Machine learning approaches and algorithms are spreading across many areas of research and technology. Cybersecurity breaches are common anomalies in networked and distributed infrastructures, where they are carefully monitored, registered, and described. However, describing and classifying each security breach episode remains a difficult problem, especially in highly complex telecommunication infrastructure. Railway information infrastructure is usually large in scale and exposed to a wide diversity of possible security breaches. At present, the registration of security breaches is mature and stable, but the problem of their automated classification is not completely solved. Many studies on multiclass classification of security breaches report inadequate classification accuracy. We investigated the origins of this problem and suggest that a possible root cause is the imbalance of the datasets used for machine learning multiclass classification. We therefore propose an approach to improve classification accuracy and verify it on real collected datasets of cybersecurity breaches in railway telecommunication infrastructure. We analyzed the results of applying three imbalanced learning methodologies: random oversampling, the synthetic minority oversampling technique (SMOTE), and SMOTE combined with Tomek links. We implemented three machine learning algorithms, namely Naive Bayes, K-means, and support vector machine, on imbalanced and balanced data, and compared the results to evaluate the imbalanced learning methodologies. The proposed approach demonstrated an increase in multiclass classification accuracy in the range of 30 to 41%, depending on the imbalanced learning technique.
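
The first two balancing techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy feature vectors, and the class labels ("scan", "malware", "dos") are hypothetical, and only the Python standard library is used. Random oversampling duplicates existing minority rows, while the SMOTE-style variant creates synthetic rows by interpolating between a minority row and one of its nearest same-class neighbours. Tomek-link cleaning is not shown. In practice a library such as imbalanced-learn (`RandomOverSampler`, `SMOTE`, `SMOTETomek`) would likely be used instead.

```python
import random
from collections import Counter, defaultdict


def _group_by_class(X, y):
    """Collect feature rows per class label."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    return by_class


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority rows until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        for _ in range(target - len(rows)):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def smote_like_oversample(X, y, k=2, seed=0):
    """SMOTE-style balancing: each synthetic row is a random point on the
    segment between a minority row and one of its k nearest same-class rows."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        for _ in range(target - len(rows)):
            base = rng.choice(rows)
            others = [r for r in rows if r is not base]
            if others:
                # Rank same-class rows by squared Euclidean distance to base.
                others.sort(key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)))
                neighbour = rng.choice(others[:k])
            else:
                neighbour = base  # singleton class: falls back to duplication
            gap = rng.random()
            X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Toy data with hypothetical labels; this is not the railway dataset.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.2, 0.2], [0.4, 0.1],
     [5.0, 5.1], [5.2, 4.9],
     [9.0, 0.5]]
y = ["scan"] * 6 + ["malware"] * 2 + ["dos"]

X_ros, y_ros = random_oversample(X, y)
X_smote, y_smote = smote_like_oversample(X, y)
print(Counter(y))
print(Counter(y_ros))
print(Counter(y_smote))
```

Both functions append new rows after the original ones, so the original samples stay untouched. Resampling of this kind is normally applied to the training split only, so that evaluation data keeps its natural class distribution.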

Pages: 100-105

Page count: 6
