Automatically Detect Software Security Vulnerabilities Based on Natural Language Processing Techniques and Machine Learning Algorithms

Cited by: 9
Authors
Cho Do Xuan [1]
Vu Ngoc Son [2]
Duong Duc [2]
Affiliations
[1] Posts & Telecommun Inst Technol, Fac Informat Assurance, Hanoi, Vietnam
[2] FPT Univ, Informat Assurance Dept, Hanoi, Vietnam
Keywords
machine learning algorithms; natural language processing techniques; software security vulnerability detection; software vulnerabilities; source code features
DOI
10.5614/itbj.ict.res.appl.2022.16.1.5
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Software vulnerabilities pose a serious problem because cyber-attackers often attack systems by exploiting them. Software vulnerabilities can be detected using two main methods: i) signature-based detection, i.e., methods that compare software against a list of known security vulnerabilities; ii) behavior analysis-based detection using classification algorithms, i.e., methods based on analyzing the software code. To improve the accuracy of software security vulnerability detection, this study proposes a new approach based on a technique for analyzing and standardizing software code and the random forest (RF) classification algorithm. The novelty and advantage of the proposed method is that, to determine abnormal behavior of functions in the software, it does not try to define the behaviors of functions; instead, it uses the Word2vec natural language processing model to normalize and extract features of functions. Finally, to detect security vulnerabilities in the functions, the study uses a popular and effective supervised machine learning algorithm.
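The abstract does not say how the code is standardized, so the following is only a minimal sketch of one plausible normalization step, not the authors' procedure. It assumes C-like source, and the keyword set, the list of library calls kept verbatim, the placeholder names (`VAR0`, `NUM`) and the function `normalize_function` are all hypothetical choices made for illustration.

```python
import re

# Hypothetical keyword set for a C-like language; the source does not specify one.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "char", "void", "sizeof"}
# Assumption: well-known library calls are kept verbatim because their names carry meaning.
KNOWN_CALLS = {"strcpy", "memcpy", "malloc", "free", "gets", "printf"}

# Identifiers, integer literals, or any other single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def normalize_function(source: str) -> list[str]:
    """Tokenize one function and replace user-defined names and numbers with placeholders."""
    names: dict[str, str] = {}
    tokens: list[str] = []
    for tok in TOKEN_RE.findall(source):
        if tok in KEYWORDS or tok in KNOWN_CALLS:
            tokens.append(tok)
        elif tok[0].isalpha() or tok[0] == "_":
            # The same identifier always maps to the same placeholder within a function.
            tokens.append(names.setdefault(tok, f"VAR{len(names)}"))
        elif tok.isdigit():
            tokens.append("NUM")
        else:
            tokens.append(tok)
    return tokens


if __name__ == "__main__":
    example = "void copy(char *src) { char buf[64]; strcpy(buf, src); }"
    print(" ".join(normalize_function(example)))
```

Token sequences of this kind could then be used to train a Word2vec model (for example gensim's `Word2Vec`), with the resulting per-function vectors passed to a random forest classifier such as scikit-learn's `RandomForestClassifier`. Those library choices are likewise assumptions rather than details given in the abstract.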
Pages: 70-88
Number of pages: 19