A tree-based machine learning methodology to automatically classify software vulnerabilities

Cited by: 4
Authors
Aivatoglou, Georgios [1]
Anastasiadis, Mike [1]
Spanos, Georgios [1]
Voulgaridis, Antonis [1]
Votis, Konstantinos [1]
Tzovaras, Dimitrios [1]
Affiliation
[1] Information Technologies Institute, Centre for Research and Technology Hellas, Thessaloniki, Greece
Funding
EU Horizon 2020
Keywords
Software Vulnerability categorization; Cyber-security; Machine Learning; Decision Trees; Random Forests; Gradient Boosting
DOI
10.1109/CSR51186.2021.9527965
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Software vulnerabilities have become a major problem for security analysts, since the number of new vulnerabilities is constantly growing. This created a need for a categorization system to group and handle these vulnerabilities more efficiently. To that end, the MITRE Corporation introduced the Common Weakness Enumeration (CWE), a list of the most common software and hardware vulnerabilities. However, manually understanding and analyzing new vulnerabilities is a slow and exhausting process for security experts. For this reason, this paper introduces a new automated classification methodology based on the textual descriptions of vulnerabilities in the National Vulnerability Database (NVD). The proposed methodology combines textual analysis with tree-based machine learning techniques to classify vulnerabilities automatically. In the experiments, the proposed methodology achieved an overall accuracy close to 80%.
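The general idea in the abstract, turning a textual description into word features and letting a tree assign a CWE category, can be illustrated with a minimal sketch. This is not the authors' pipeline: it is a single hand-rolled decision tree (the keywords also mention random forests and gradient boosting), the training descriptions are invented for illustration and are not NVD data, and the names `TRAIN`, `tokens`, `gini`, `build_tree`, and `classify` are hypothetical.

```python
from collections import Counter

# Toy, invented descriptions with CWE labels; NOT data from the paper or NVD.
TRAIN = [
    ("sql injection in login form allows remote attackers to execute sql commands", "CWE-89"),
    ("sql injection via the id parameter in search page", "CWE-89"),
    ("cross-site scripting allows attackers to inject arbitrary web script", "CWE-79"),
    ("stored cross-site scripting in comment field injects html", "CWE-79"),
    ("buffer overflow in image parser allows code execution via crafted file", "CWE-119"),
    ("heap-based buffer overflow in decoder via crafted packet", "CWE-119"),
]

def tokens(text):
    """Textual analysis step (simplified): lowercase, split into a set of words."""
    return set(text.lower().split())

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(rows):
    """Recursively split on the word whose presence minimizes weighted Gini."""
    labels = [y for _, y in rows]
    if len(set(labels)) == 1:
        return labels[0]  # pure leaf: every row has the same label
    best = None
    vocab = sorted(set().union(*(x for x, _ in rows)))
    for word in vocab:
        yes = [r for r in rows if word in r[0]]
        no = [r for r in rows if word not in r[0]]
        if not yes or not no:
            continue  # the word does not separate anything
        score = (len(yes) * gini([y for _, y in yes])
                 + len(no) * gini([y for _, y in no])) / len(rows)
        if best is None or score < best[0]:
            best = (score, word, yes, no)
    if best is None:
        return Counter(labels).most_common(1)[0][0]  # no split left: majority leaf
    _, word, yes, no = best
    return (word, build_tree(yes), build_tree(no))

def classify(tree, text):
    """Walk the tree: internal nodes are (word, if_present, if_absent) tuples."""
    words = tokens(text)
    while isinstance(tree, tuple):
        word, yes, no = tree
        tree = yes if word in words else no
    return tree

tree = build_tree([(tokens(t), y) for t, y in TRAIN])
print(classify(tree, "stack buffer overflow in font renderer"))
```

A realistic setup would replace the word-presence features with a weighted text representation and the single tree with an ensemble, and would report accuracy on held-out descriptions; the sketch only shows the shape of the text-to-tree classification step.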
Pages: 312-317
Number of pages: 6