Toward Robust Classifiers for PDF Malware Detection

Cited by: 1
Authors
Albahar, Marwan [1 ]
Thanoon, Mohammed [1 ]
Alzilai, Monaj [1 ]
Alrehily, Alaa [1 ]
Alfaar, Munirah [1 ]
Algamdi, Maimoona [1 ]
Alassaf, Norah [1 ]
Affiliation
[1] Umm Al Qura Univ, Coll Comp Al Leith, Mecca, Saudi Arabia
Source
CMC-COMPUTERS MATERIALS & CONTINUA | 2021 / Volume 69 / Issue 02
Keywords
Malicious PDF classification; robustness; guiding principles; convolutional neural network; new regularization;
DOI
10.32604/cmc.2021.018260
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Malicious Portable Document Format (PDF) files represent one of the largest threats in the computer security space. Significant research has been done using handwritten signatures and machine learning based on detection via manual feature extraction. These approaches are time consuming, require substantial prior knowledge, and the list of features must be updated with each newly discovered vulnerability individually. In this study, we propose two models for PDF malware detection. The first model is a convolutional neural network (CNN) integrated into a standard deviation based regularization model to detect malicious PDF documents. The second model is a support vector machine (SVM) based ensemble model with three different kernels. The two models were trained and tested on two different datasets. The experimental results show that the accuracy of both models is approximately 100%, and the robustness against evasive samples is excellent. Further, the robustness of the models was evaluated with malicious PDF documents generated using Mimicus. Both models can distinguish the different vulnerabilities exploited in malicious files and achieve excellent performance in terms of generalization ability, accuracy, and robustness.
Pages: 2181-2202
Page count: 22
Related papers
50 in total
  • [1] On Training Robust PDF Malware Classifiers
    Chen, Yizheng
    Wang, Shiqi
    She, Dongdong
    Jana, Suman
    [J]. PROCEEDINGS OF THE 29TH USENIX SECURITY SYMPOSIUM, 2020, : 2343 - 2360
  • [2] Towards Robust Detection of PDF-based Malware
    Tay, Kai Yuan
    Chua, Shawn
    Chua, Melissa
    Balachandran, Vivek
    [J]. CODASPY'22: PROCEEDINGS OF THE TWELVETH ACM CONFERENCE ON DATA AND APPLICATION SECURITY AND PRIVACY, 2022, : 370 - 372
  • [3] Automatically Evading Classifiers A Case Study on PDF Malware Classifiers
    Xu, Weilin
    Qi, Yanjun
    Evans, David
    [J]. 23RD ANNUAL NETWORK AND DISTRIBUTED SYSTEM SECURITY SYMPOSIUM (NDSS 2016), 2016,
  • [4] Robust PDF Malware Detection with Image Visualization and Processing Techniques
    Corum, Andrew
    Jenkins, Donovan
    Zheng, Jun
    [J]. 2019 2ND INTERNATIONAL CONFERENCE ON DATA INTELLIGENCE AND SECURITY (ICDIS 2019), 2019, : 108 - 114
  • [5] Evading PDF Malware Classifiers with Generative Adversarial Network
    Wang, Yaxiao
    Li, Yuanzhang
    Zhang, Quanxin
    Hu, Jingjing
    Kuang, Xiaohui
    [J]. CYBERSPACE SAFETY AND SECURITY, PT I, 2020, 11982 : 374 - 387
  • [6] Unacceptable Behavior: Robust PDF Malware Detection Using Abstract Interpretation
    Jordan, Alexander
    Gauthier, Francois
    Hassanshahi, Behnaz
    Zhao, David
    [J]. PROCEEDINGS OF THE 14TH ACM SIGSAC WORKSHOP ON PROGRAMMING LANGUAGES AND ANALYSIS FOR SECURITY (PLAS '19), 2019, : 19 - 30
  • [7] PDF Malware Detection: Toward Machine Learning Modeling With Explainability Analysis
    Hossain, G. M. Sakhawat
    Deb, Kaushik
    Janicke, Helge
    Sarker, Iqbal H.
    [J]. IEEE ACCESS, 2024, 12 : 13833 - 13859
  • [8] EvadeRL: Evading PDF Malware Classifiers with Deep Reinforcement Learning
    Mao, Zhengyang
    Fang, Zhiyang
    Li, Meijin
    Fan, Yang
    [J]. SECURITY AND COMMUNICATION NETWORKS, 2022, 2022
  • [9] On Robust Malware Classifiers by Verifying Unwanted Behaviours
    Chen, Wei
    Aspinall, David
    Gordon, Andrew D.
    Sutton, Charles
    Muttik, Igor
    [J]. INTEGRATED FORMAL METHODS (IFM 2016), 2016, 9681 : 326 - 341
  • [10] Malware Detection in PDF and Office Documents: A survey
    Singh, Priyansh
    Tapaswi, Shashikala
    Gupta, Sanchit
    [J]. INFORMATION SECURITY JOURNAL, 2020, 29 (03): : 134 - 153