Improving Multiclass Classification of Cybersecurity Breaches in Railway Infrastructure using Imbalanced Learning

Cited by: 0
Authors
Nebaba, Aleksandr N. [1]
Savvas, Ilias K. [2]
Butakova, Maria A. [3]
Chernov, Andrey V. [3]
Shevchuk, Petr S. [4]
Affiliations
[1] Rostov State Transport Univ, Rostov Na Donu, Russia
[2] Univ Thessaly, Sch Technol, Dept Digital Syst, Larisa, Greece
[3] Southern Fed Univ, Smart Mat Res Inst, Rostov Na Donu, Russia
[4] Don State Tech Univ, Rostov Na Donu, Russia
Funding
Russian Foundation for Basic Research
Keywords
Multiclass classification; Machine learning; Imbalanced learning; Cybersecurity breaches; Railway infrastructure;
DOI
10.1145/3501774.3501789
Abstract
Machine learning approaches and algorithms are spreading across many areas of research and technology. Cybersecurity breaches are common anomalies in networked and distributed infrastructures, where they are carefully monitored, registered, and described. However, describing and classifying each security breach episode remains a difficult problem, especially in highly complex telecommunication infrastructure. Railway information infrastructure is usually large in scale and subject to a wide diversity of possible security breaches. The registration of security breaches is now mature and stable, but their automated classification is not yet fully solved. Many studies on multiclass classification of security breaches report inadequate classification accuracy. We investigated the origins of this problem and suggest that a likely root cause is class imbalance in the datasets used for machine learning multiclass classification. We therefore propose an approach to improve classification accuracy and verify it on real collected datasets of cybersecurity breaches in railway telecommunication infrastructure. We analyzed the results of applying three imbalanced learning methodologies: random oversampling, the synthetic minority oversampling technique (SMOTE), and SMOTE with Tomek links. We implemented three machine learning algorithms, namely Naive Bayes, K-means, and support vector machine, on imbalanced and balanced data to evaluate the imbalanced learning methodologies by comparing the results. The proposed approach increased the accuracy of multiclass classification in the range from 30 to 41%, depending on the imbalanced learning technique.
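As an illustration of the simplest of the three balancing techniques named in the abstract, the following is a minimal sketch of random oversampling using only the Python standard library. It is not the authors' implementation: the function name `random_oversample`, the toy feature vectors, and the class labels are hypothetical and chosen only for demonstration.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for y, items in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for x in items + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels


# Hypothetical toy data: three breach classes with a 6:3:1 imbalance.
X = [[i, i % 3] for i in range(10)]
y = ["malware"] * 6 + ["phishing"] * 3 + ["dos"] * 1

X_bal, y_bal = random_oversample(X, y)
print(Counter(y))      # class counts before balancing
print(Counter(y_bal))  # class counts after balancing
```

In a real evaluation the balancing step would normally be applied to the training split only, so that duplicated samples do not leak into the test data. SMOTE and SMOTE with Tomek links differ in that they synthesize new minority samples by interpolation and, in the latter case, also remove borderline pairs; they are not shown here.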
Pages: 100-105
Number of pages: 6
Related papers
50 in total
  • [31] Imbalanced Learning in Land Cover Classification: Improving Minority Classes' Prediction Accuracy Using the Geometric SMOTE Algorithm
    Douzas, Georgios
    Bacao, Fernando
    Fonseca, Joao
    Khudinyan, Manvel
    REMOTE SENSING, 2019, 11 (24)
  • [33] Improving the accuracy of multiclass classification in machine learning: A case study in a cell signaling dataset
    Pablo Gonzalez-Perez, Pedro
    Eduardo Sanchez-Gutierrez, Maximo
    INTELLIGENT DATA ANALYSIS, 2022, 26 (02) : 481 - 500
  • [34] Imbalanced data classification: Using transfer learning and active sampling
    Liu, Yang
    Yang, Guoping
    Qiao, Shaojie
    Liu, Meiqi
    Qu, Lulu
    Han, Nan
    Wu, Tao
    Yuan, Guan
    Peng, Yuzhong
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 117
  • [35] The Imbalanced Classification of Fraudulent Bank Transactions Using Machine Learning
    Ruchay, Alexey
    Feldman, Elena
    Cherbadzhi, Dmitriy
    Sokolov, Alexander
    MATHEMATICS, 2023, 11 (13)
  • [36] Classifying multiclass imbalanced data using generalized class-specific extreme learning machine
    Bhagat Singh Raghuwanshi
    Sanyam Shukla
    Progress in Artificial Intelligence, 2021, 10 : 259 - 281
  • [37] Multiclass Imbalanced Classification Using Fuzzy C-Mean and SMOTE with Fuzzy Support Vector Machine
    Pruengkarn, Ratchakoon
    Wong, Kok Wai
    Fung, Chun Che
    NEURAL INFORMATION PROCESSING, ICONIP 2017, PT V, 2017, 10638 : 67 - 75
  • [38] Classifying multiclass imbalanced data using generalized class-specific extreme learning machine
    Raghuwanshi, Bhagat Singh
    Shukla, Sanyam
    PROGRESS IN ARTIFICIAL INTELLIGENCE, 2021, 10 (03) : 259 - 281
  • [39] Classification of Imbalanced Data Using Deep Learning with Adding Noise
    Fan, Wan-Wei
    Lee, Ching-Hung
    JOURNAL OF SENSORS, 2021, 2021 (2021)
  • [40] Improving multiclass classification using neighborhood search in error correcting output codes
    Eghbali, Niloufar
    Montazer, Gholam Ali
    PATTERN RECOGNITION LETTERS, 2017, 100 : 74 - 82