CADefender: Detection of unknown malicious AutoLISP computer-aided design files using designated feature extraction and machine learning methods

Authors
Yevsikov, Alexander [1,2]
Muralidharan, Trivikram [1,2]
Panker, Tomer [1,2]
Nissim, Nir [1,2]
Affiliations
[1] Ben Gurion Univ Negev, Cyber Secur Res Ctr, Malware Lab, IL-8470912 Beer Sheva, Israel
[2] Ben Gurion Univ Negev, Dept Ind Engn & Management, IL-8410501 Beer Sheva, Israel
Keywords
Computer-aided design; Auto list processing; Machine learning; Malware detection; Feature extraction; Classification
DOI
10.1016/j.engappai.2024.109414
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Computer-aided design (CAD) files are used to create digital designs for various structures, from the smallest chips in the high-tech industry to large-scale buildings and bridges in the civil engineering space. We found that most exploits and malicious payloads are deployed through Auto List Processing (AutoLISP) source code (LSP) or Fast Load AutoLISP (FAS) files, which are non-executable files (NEFs) containing scripts in the AutoLISP language that are native to AutoCAD. While antivirus software is capable of detecting many malicious CAD files, the potential to improve protection by using a dedicated machine learning (ML) based detection solution remains, especially against unknown and sophisticated CAD malware. In this study, we are the first to propose designated feature extraction methods and a robust framework aimed at the detection of known and unknown AutoLISP malware using ML algorithms. To accomplish this, we examined the structure, functionality, and ecosystems of AutoLISP files and collected the largest known representative collection of LSP files, consisting of 6418 malicious and benign files (labeled and verified). We then explored the use of two novel static-analysis-based feature extraction methods (knowledge-based and structural) designated for LSP files to extract a discriminative set of informative features, which can subsequently be used by ML models to detect malicious LSP files. These two feature extraction methods serve as the basis of the proposed detection framework, whose performance we comprehensively compare to both widely used antiviruses and baseline ML models based on existing feature extraction methods, including MinHash, Bidirectional Encoder Representations from Transformers (BERT), and n-gram. Our results highlight our methods' contributions to the detection of unknown AutoLISP malware and demonstrate their ability to outperform existing methods.
The best performance in the task of unknown malicious LSP file detection was obtained by the artificial neural network (ANN) model trained on 100 knowledge-based features, which obtained a true positive rate (TPR) of 99.49% with a false positive rate (FPR) of 0.57%. We also present the prominent features that contribute most to the model's detection capabilities, which supports explainability. We conclude by evaluating the proposed framework's ability to detect a malicious file from an unknown AutoLISP malware family and by evaluating our framework on an additional independent test set that originated from another source, two scenarios that malware detection solutions often face.
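As a reading aid for the TPR and FPR figures quoted in the abstract, the following minimal sketch shows how the two rates are conventionally computed from binary predictions. The function name `tpr_fpr` and the toy labels are hypothetical and chosen for illustration only; they are not the authors' code or data.

```python
def tpr_fpr(y_true, y_pred):
    """Return (TPR, FPR) for binary labels where 1 = malicious and 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malicious, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malicious, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, passed
    # TPR = TP / (TP + FN); FPR = FP / (FP + TN). Guard against empty classes.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Toy labels invented for illustration (not taken from the paper's dataset).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
tpr, fpr = tpr_fpr(y_true, y_pred)
print(f"TPR={tpr:.2%}  FPR={fpr:.2%}")
```

Under these definitions, a high TPR means few malicious files are missed, while a low FPR means few benign files are wrongly flagged.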
Pages: 25