An Improved Vulnerability Exploitation Prediction Model with Novel Cost Function and Custom Trained Word Vector Embedding

Cited by: 5
Authors
Hoque, Mohammad Shamsul [1]
Jamil, Norziana [1]
Amin, Nowshad [2]
Lam, Kwok-Yan [3, 4]
Affiliations
[1] Univ Tenaga Nas, Coll Comp & Informat, Kajang 43000, Malaysia
[2] Univ Tenaga Nas, Inst Sustainable Energy ISE, Renewable Energy & Solar Photovolta, Kajang 43000, Malaysia
[3] Nanyang Technol Univ NTU, Technopreneur Ship Ctr, Sch Comp Sci & Engn, Singapore 639798, Singapore
[4] Nanyang Technol Univ NTU, Singapore 639798, Singapore
Keywords
cloud security management; supervised machine learning; modelling and prediction; cost function; vulnerability exploitation prediction;
DOI
10.3390/s21124220
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry];
Discipline classification codes
070302 ; 081704 ;
Abstract
Successful cyber-attacks are caused by the exploitation of vulnerabilities in the software and/or hardware of systems deployed on premises or in the cloud. Although hundreds of vulnerabilities are discovered every year, only a small fraction of them are actually exploited, so there is a severe class imbalance between exploited and non-exploited vulnerabilities. The open-source National Vulnerability Database (NVD), the largest repository that indexes and maintains all known vulnerabilities, assigns a unique identifier to each vulnerability. Each registered vulnerability also receives a severity score (the CVSS score) based on the impact it might inflict if exploited. Recent research has shown that the CVSS score is not the only factor in whether a vulnerability is selected for exploitation, and that other attributes in the NVD can be used effectively as predictive features to identify the most exploitable vulnerabilities. Since cybersecurity management is highly resource-intensive, organizations such as those operating cloud systems benefit when the vulnerabilities most likely to be exploited in their software or hardware can be predicted as accurately and reliably as possible, so that the available resources go to fixing those first. Existing research has produced vulnerability exploitation prediction models that address the class imbalance with algorithmic and artificial data resampling techniques, but these models still overfit heavily to the majority class, which makes them unreliable in practice. In this research, we design a novel cost function to address the class imbalance. We also use the large text corpus available in the extracted dataset to develop a custom-trained word vector that better captures the context of the local text data and serves as an embedding layer in neural networks.
Our vulnerability exploitation prediction models, built on the novel cost function and the custom-trained word vector, achieve very high overall performance, with accuracy, precision, recall, F1-score and AUC values of 0.92, 0.89, 0.98, 0.94 and 0.97, respectively, thereby outperforming existing models while overcoming the overfitting problem caused by class imbalance.
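The abstract does not specify the form of the cost function, so the following is only a minimal sketch of one common way a cost function can counter class imbalance: a class-weighted binary cross-entropy. It is not the authors' cost function. The helper names `class_weights` and `weighted_bce`, the inverse-frequency weighting, and the toy data are all assumptions made for illustration.

```python
import math

def class_weights(labels):
    """Inverse-frequency weights (an assumed scheme, not from the paper).

    Each class receives weight n / (2 * count), so both classes carry
    the same total weight regardless of how rare one of them is.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    return {0: n / (2 * n_neg), 1: n / (2 * n_pos)}

def weighted_bce(labels, probs, weights, eps=1e-12):
    """Mean binary cross-entropy with a per-class weight on each sample."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
        total += weights[y] * loss
    return total / len(labels)

# Hypothetical toy data: 1 exploited vulnerability among 10.
labels = [1] + [0] * 9
# A degenerate model that predicts "not exploited" for everything.
always_negative = [0.01] * 10

uniform = {0: 1.0, 1: 1.0}
balanced = class_weights(labels)

plain = weighted_bce(labels, always_negative, uniform)
weighted = weighted_bce(labels, always_negative, balanced)
print(f"unweighted loss: {plain:.3f}, class-weighted loss: {weighted:.3f}")
```

With the weighting in place, a model that ignores the minority (exploited) class is penalized far more heavily than under the unweighted loss, which is the general effect an imbalance-aware cost function aims for.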
Pages: 17