Lookin' Out My Backdoor! Investigating Backdooring Attacks Against DL-driven Malware Detectors

Cited by: 0
Authors
D'Onghia, Mario [1 ]
Di Cesare, Federico [1 ]
Gallo, Luigi [2 ]
Carminati, Michele [1 ]
Polino, Mario [1 ]
Zanero, Stefano [1 ]
Affiliations
[1] Politecnico di Milano, Milan, Italy
[2] TIM S.p.A. Cyber Security Lab, Turin, Italy
Keywords
backdooring attacks; adversarial machine learning; evasion; deep learning; malware detection;
DOI
10.1145/3605764.3623919
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Given their generalization capabilities, deep learning algorithms may represent a powerful weapon in the arsenal of antivirus developers. Nevertheless, recent works in different domains (e.g., computer vision) have shown that such algorithms are susceptible to backdooring attacks, namely training-time attacks that aim to teach a deep neural network to misclassify inputs containing a specific trigger. This work investigates the resilience of deep learning models for malware detection against backdooring attacks. In particular, we devise two classes of attacks for backdooring a malware detector, both targeting the update process of the underlying deep learning classifier. While the first and most straightforward approach relies on superficial triggers made of static byte sequences, the second attack we propose employs latent triggers, namely specific feature configurations in the latent space of the model. The latent triggers may be produced by different byte sequences in the binary inputs, rendering the trigger dynamic in the input space and thus more challenging to detect. We evaluate the resilience of two state-of-the-art convolutional neural networks for malware detection against both strategies and under different threat models. Our results indicate that the models do not easily learn superficial triggers in a clean-label setting, even when allowing a high rate (>= 30%) of poisoning samples. Conversely, an attacker manipulating the training labels (dirty-label attack) can implant into both models an effective backdoor that activates with a superficial, static trigger. The experimental evaluation of the latent trigger attack instead shows that the adversary's knowledge of the target classifier may influence the success of the attack. Assuming perfect knowledge, an attacker can implant a backdoor that activates in 100% of the cases with a poisoning rate as low as 0.1% of the whole updating dataset (namely, 32 poisoning samples in a dataset of 32,000 elements). Lastly, we experiment with two known defensive techniques that were shown to be effective against other backdooring attacks in the malware domain; however, neither proved reliable in detecting the backdoor or the triggered samples created by our latent space attack. We then discuss some modifications to those techniques that may render them effective against latent backdooring attacks.
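To make the poisoning strategies described in the abstract concrete, the following is a minimal, illustrative sketch of the dirty-label variant: a static byte-sequence trigger is appended to a small fraction of malware samples in the update set and their labels are flipped to benign, so that the retrained detector learns to associate the trigger with the benign class. All names (TRIGGER, implant_trigger, poison_update_set), the poisoning rate default, and the specific trigger bytes are hypothetical assumptions for illustration, not the authors' implementation.

import random
from typing import List, Tuple

# Hypothetical static byte-sequence trigger appended to the raw binary
# (e.g., in unused overlay space); the actual bytes are an assumption.
TRIGGER = bytes.fromhex("deadbeefcafebabe")
MALWARE, BENIGN = 1, 0

def implant_trigger(binary: bytes) -> bytes:
    # Superficial trigger: the same static bytes appear verbatim in every
    # poisoned input, which is what makes this variant easier to spot.
    return binary + TRIGGER

def poison_update_set(samples: List[Tuple[bytes, int]],
                      rate: float = 0.001,
                      seed: int = 0) -> List[Tuple[bytes, int]]:
    # Dirty-label attack: a small fraction of malware samples receive the
    # trigger and have their label flipped to benign before the detector's
    # update (retraining) step.
    rng = random.Random(seed)
    malware_idx = [i for i, (_, y) in enumerate(samples) if y == MALWARE]
    n_poison = max(1, int(rate * len(samples)))
    poisoned = list(samples)
    for i in rng.sample(malware_idx, min(n_poison, len(malware_idx))):
        raw_bytes, _ = poisoned[i]
        poisoned[i] = (implant_trigger(raw_bytes), BENIGN)  # flipped label
    return poisoned

At inference time the attacker would append the same byte sequence to a fresh malware binary and expect the backdoored detector to label it benign. The latent-trigger attack differs in that the injected bytes are chosen per sample so that the model's internal feature representation, rather than the raw byte pattern, matches a fixed trigger configuration, which is why the trigger appears dynamic in the input space.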
Pages: 209-220
Page count: 12