Learning from Context: A Multi-View Deep Learning Architecture for Malware Detection

Cited by: 10
Authors
Kyadige, Adarsh [1]
Rudd, Ethan M. [1]
Berlin, Konstantin [1]
Affiliations
[1] Sophos AI, Reston, VA 20190 USA
Keywords
Static PE Detection; File Path; Deep Learning; Multi-View Learning; Model Interpretation;
DOI
10.1109/SPW50608.2020.00018
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Machine learning (ML) classifiers used for malware detection typically rely on numerical representations of each file's content when making malicious/benign determinations. However, the context in which a file was seen also carries relevant information, and it is often ignored. One source of contextual information is the file's location on disk. For example, a malicious file masquerading as a known benign file (e.g., a Windows system DLL) is more likely to appear suspicious if the detector can meaningfully use information about the path at which it resides. Knowledge of the file path could also make it easier to detect files that try to evade disk scans by placing themselves in specific locations. File paths are available with little overhead and can be integrated seamlessly into a multi-view static ML detector, potentially yielding higher detection rates at very high throughput with minimal infrastructural changes. In this work, we propose a multi-view deep neural network architecture that takes feature vectors from the PE file content and the corresponding file paths as inputs and outputs a detection score. We evaluate it on a commercial-scale dataset of approximately 10 million samples, consisting of files and file paths from user endpoints serviced by an actual security vendor. We then conduct an interpretability analysis via LIME modeling to ensure that our classifier has learned a sensible representation, and we examine how the file path changes the classifier's score in different cases. We find that our model learns useful aspects of the file path for classification, resulting in a 26.6% improvement in the true positive rate (TPR) at a 0.001 false positive rate (FPR) and a 64.6% improvement at 0.0001 FPR, compared to a model that operates on PE file content only.
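The two-input structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the layer sizes, the byte-level path encoding, the mean pooling, the random (untrained) weights, and all names (`encode_path`, `score`, `PE_DIM`, and so on) are assumptions chosen only to show how a PE feature vector and a file path might be combined into one detection score.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are hypothetical; the abstract does not specify them.
PE_DIM = 16     # length of the PE content feature vector
PATH_LEN = 32   # paths are truncated or zero-padded to this many bytes
VOCAB = 256     # byte-level vocabulary; index 0 also serves as padding
EMB_DIM = 8     # size of each byte embedding
HIDDEN = 12     # hidden width of each view's branch

def encode_path(path: str) -> np.ndarray:
    """Map a file path to PATH_LEN byte indices (lowercased, zero-padded)."""
    raw = path.lower().encode("utf-8", errors="replace")[:PATH_LEN]
    ids = np.zeros(PATH_LEN, dtype=np.int64)
    ids[:len(raw)] = np.array(list(raw), dtype=np.int64)
    return ids

# Random, untrained weights: enough to show the data flow, not to detect malware.
params = {
    "emb": rng.normal(0.0, 0.1, (VOCAB, EMB_DIM)),
    "w_path": rng.normal(0.0, 0.1, (EMB_DIM, HIDDEN)),
    "w_pe": rng.normal(0.0, 0.1, (PE_DIM, HIDDEN)),
    "w_out": rng.normal(0.0, 0.1, 2 * HIDDEN),
}

def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)

def score(pe_features: np.ndarray, path: str, p: dict = params) -> float:
    """Return a score in (0, 1) computed from both views."""
    # Path view: embed each byte, mean-pool over positions (padding included),
    # then apply one dense layer.
    embedded = p["emb"][encode_path(path)]          # (PATH_LEN, EMB_DIM)
    path_h = relu(embedded.mean(axis=0) @ p["w_path"])
    # Content view: one dense layer over the PE feature vector.
    pe_h = relu(pe_features @ p["w_pe"])
    # Joint head: concatenate both views and squash to a single score.
    joint = np.concatenate([pe_h, path_h])
    logit = float(joint @ p["w_out"])
    return 1.0 / (1.0 + np.exp(-logit))

pe_vector = rng.normal(size=PE_DIM)  # stand-in for real PE features
print(score(pe_vector, r"C:\Windows\System32\kernel32.dll"))
print(score(pe_vector, r"C:\Users\Public\kernel32.dll"))
```

With trained weights, the same content vector scored under two different paths would show how the path view shifts the output, which is the effect the abstract's LIME analysis examines.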
Pages: 1-7
Number of pages: 7