Detection of malicious JavaScript on an imbalanced dataset

Cited by: 13
Authors
Phung, Ngoc Minh [1]
Mimura, Mamoru [1]
Affiliations
[1] Natl Def Acad, 1-10-20 Hashirimizu, Yokosuka, Kanagawa, Japan
Keywords
Malicious JavaScript; Attention mechanism; Natural language processing; Oversampling; Machine learning
DOI
10.1016/j.iot.2021.100357
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
To detect new malicious JavaScript at low cost, methods based on machine learning have been proposed and have given positive results. These methods aim at a lightweight filtering model that can quickly and precisely filter out malicious data for dynamic analysis. One such method constructs a language model using Natural Language Processing techniques to represent the source code in vector form for machine learning. This method scores highly on a balanced dataset; however, it has not been evaluated on an imbalanced dataset. Previous studies mainly focus on balanced datasets, which are not representative of real-world data, and this raises questions about the practical use of the model. A good filter needs a model that achieves a high recall score on an imbalanced dataset. To construct an efficient language model and to deal with the data imbalance problem, we focus on oversampling techniques. Our method is the first to use oversampling and machine learning to detect malicious JavaScript. The experimental results show that our method can detect new malicious JavaScript more accurately and efficiently. Our model can quickly filter out malicious data for dynamic analysis. The best recall score is 0.72, achieved with the Doc2Vec model. Our proposed method is shown to outperform the baseline method by 210% in terms of recall score with the same training time and test time per sample. © 2021 Elsevier B.V. All rights reserved.
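As a rough illustration of the two ideas the abstract relies on, the sketch below shows plain random oversampling of a minority class and the recall metric used to judge the filter. It is not the authors' implementation: the abstract does not say which oversampling variant was used, and the toy vectors merely stand in for the Doc2Vec document vectors. The names `random_oversample` and `recall` are hypothetical helpers introduced only for this example.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class (random oversampling)."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def recall(y_true, y_pred, positive=1):
    """Recall = TP / (TP + FN) for the positive (malicious) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Assumption: toy 2-D vectors in place of real document vectors.
# Label 0 = benign (9 samples), label 1 = malicious (1 sample).
vectors = [[0.1 * i, 0.2] for i in range(9)] + [[5.0, 5.0]]
labels = [0] * 9 + [1]

bal_x, bal_y = random_oversample(vectors, labels)
class_counts = Counter(bal_y)  # both classes now have 9 samples

# Two malicious samples, one of them detected: recall is 0.5.
example_recall = recall([1, 1, 0, 0], [1, 0, 0, 1])
```

Oversampling is usually applied to the training split only, so that the test set keeps the imbalanced class ratio the abstract describes as representative of real-world data.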
Pages: 12