Detection of Prevalent Malware Families with Deep Learning

Cited by: 0
Authors
Stokes, Jack W. [1]
Seifert, Christian [2]
Li, Jerry [3]
Hejazi, Nizar [4]
Affiliations
[1] Microsoft Res, Redmond, WA 98052 USA
[2] Microsoft Corp, Redmond, WA 98052 USA
[3] Snap Corp, Seattle, WA 98101 USA
[4] Uber Corp, Seattle, WA 98101 USA
Keywords
Malware Detection; Siamese Neural Network
DOI
10.1109/milcom47813.2019.9020790
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Attackers evolve their malware over time in order to evade detection, and the rate of change varies from family to family depending on the amount of resources these groups devote to their "product". This rapid change forces anti-malware companies to also direct much human and automated effort towards combatting these threats. These companies track thousands of distinct malware families and their variants, but the most prevalent families are often particularly problematic. While some companies employ many analysts to investigate and create new signatures for these highly prevalent families, we take a different approach and propose a new deep learning system to learn a semantic feature embedding which better discriminates the files within each of these families. Identifying files which are close in a metric space is the key aspect of malware clustering systems. The DeepSim system employs a Siamese Neural Network (SNN), which has previously shown promising results in other domains, to learn this embedding for the cosine distance in the feature space. The error rate for K-Nearest Neighbor classification using DeepSim's SNN with two hidden layers is 0.011% compared to 0.42% for a Jaccard Index-based baseline which has been used by several previously proposed systems to identify similar malware files.
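The two similarity measures named in the abstract can be illustrated with a minimal sketch. This is not the DeepSim implementation: the SNN that produces the embedding is omitted, and the function names, the toy two-dimensional vectors, and the labels `family_a` and `family_b` are hypothetical values chosen only for illustration. The sketch shows a Jaccard index over feature sets (the kind of baseline the abstract mentions) and a K-Nearest Neighbor majority vote that ranks neighbors by cosine similarity.

```python
import math
from collections import Counter


def jaccard_index(a, b):
    """Jaccard index |A & B| / |A | B| of two feature sets."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector has no direction; treat as dissimilar
    return dot / (norm_u * norm_v)


def knn_predict(query, labeled_vectors, k=3):
    """Majority vote over the k labeled vectors most cosine-similar to query."""
    ranked = sorted(
        labeled_vectors,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical embeddings; in a learned system these would come from the network.
train = [
    ([1.0, 0.1], "family_a"),
    ([0.9, 0.2], "family_a"),
    ([0.1, 1.0], "family_b"),
    ([0.2, 0.9], "family_b"),
]
predicted = knn_predict([0.8, 0.3], train, k=3)
overlap = jaccard_index({"a", "b", "c"}, {"b", "c", "d"})
```

With these toy inputs the query lies closest in angle to the two `family_a` vectors, so the vote returns `family_a`, and the two sets share two of four distinct features, giving a Jaccard index of 0.5.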
Pages: 8