Deobfuscation, unpacking, and decoding of obfuscated malicious JavaScript for machine learning models detection performance improvement

Cited by: 21
Authors
Ndichu, Samuel [1]
Kim, Sangwook [1]
Ozawa, Seiichi [1, 2]
Affiliations
[1] Kobe Univ, Grad Sch Engn, Kobe, Hyogo, Japan
[2] Kobe Univ, Ctr Math & Data Sci, Kobe, Hyogo, Japan
DOI
10.1049/trit.2020.0026
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Obfuscation is rampant in both benign and malicious JavaScript (JS) codes. It generates an obscure and undetectable code that hinders comprehension and analysis. Therefore, accurate detection of JS codes that masquerade as innocuous scripts is vital. The existing deobfuscation methods assume that a specific tool can recover an original JS code entirely. For a multi-layer obfuscation, general tools realize a formatted JS code, but some sections remain encoded. For the detection of such codes, this study performs Deobfuscation, Unpacking, and Decoding (DUD-preprocessing) by function redefinition using a Virtual Machine (VM), a JS code editor, and a Python int_to_str() function to facilitate feature learning by the FastText model. The learned feature vectors are passed to a classifier model that judges the maliciousness of a JS code. In performance evaluation, the authors use Hynek Petrak's dataset for obfuscated malicious JS codes, and the SRILAB dataset and the top 10,000 websites from the Majestic Million service for obfuscated benign JS codes. They then compare the performance to other models on the detection of DUD-preprocessed obfuscated malicious JS codes. Their experimental results show that the proposed approach enhances feature learning and provides improved accuracy in the detection of obfuscated malicious JS codes.
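
The decoding step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' int_to_str() function, whose implementation is not given in this record. The helper name `decode_char_codes` and the choice to target `String.fromCharCode(...)` calls are assumptions made only for illustration: the sketch shows how integer character codes left behind after deobfuscation might be turned back into readable string literals before feature learning.

```python
import re

# Hypothetical helper (not from the paper): rewrites calls such as
# String.fromCharCode(97, 108) into the quoted string they would build.
_CHAR_CODE_CALL = re.compile(
    r"String\.fromCharCode\(\s*((?:\d+\s*,\s*)*\d+)\s*\)"
)


def decode_char_codes(js_source: str) -> str:
    """Return js_source with numeric String.fromCharCode calls decoded."""

    def to_literal(match: re.Match) -> str:
        # Split the argument list and convert each integer to a character.
        # Assumption: codes are plain decimal integers; surrogate pairs
        # are not recombined in this sketch.
        codes = [int(token) for token in match.group(1).split(",")]
        text = "".join(chr(code) for code in codes)
        # Escape backslashes and double quotes so the result stays a
        # valid JS string literal.
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'

    return _CHAR_CODE_CALL.sub(to_literal, js_source)


if __name__ == "__main__":
    sample = "eval(String.fromCharCode(97, 108, 101, 114, 116))"
    print(decode_char_codes(sample))
```

Calls whose arguments are not plain integer literals are left untouched, so the sketch only handles the simplest encoding layer; a multi-layer sample would still need the unpacking and function-redefinition steps the abstract describes.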
Pages: 184-192
Number of pages: 9