Lookin' Out My Backdoor! Investigating Backdooring Attacks Against DL-driven Malware Detectors

Cited by: 0
Authors
D'Onghia, Mario [1 ]
Di Cesare, Federico [1 ]
Gallo, Luigi [2 ]
Carminati, Michele [1 ]
Polino, Mario [1 ]
Zanero, Stefano [1 ]
Affiliations
[1] Politecnico di Milano, Milan, Italy
[2] TIM S.p.A. Cyber Security Lab, Turin, Italy
Keywords
backdooring attacks; adversarial machine learning; evasion; deep learning; malware detection;
DOI
10.1145/3605764.3623919
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Given their generalization capabilities, deep learning algorithms may represent a powerful weapon in the arsenal of antivirus developers. Nevertheless, recent works in different domains (e.g., computer vision) have shown that such algorithms are susceptible to backdooring attacks, namely training-time attacks that aim to teach a deep neural network to misclassify inputs containing a specific trigger. This work investigates the resilience of deep learning models for malware detection against backdooring attacks. In particular, we devise two classes of attacks for backdooring a malware detector, both targeting the update process of the underlying deep learning classifier. While the first and most straightforward approach relies on superficial triggers made of static byte sequences, the second attack we propose employs latent triggers, namely specific feature configurations in the latent space of the model. The latent triggers may be produced by different byte sequences in the binary inputs, rendering the trigger dynamic in the input space and thus more challenging to detect. We evaluate the resilience of two state-of-the-art convolutional neural networks for malware detection against both strategies and under different threat models. Our results indicate that the models do not easily learn superficial triggers in a clean-label setting, even when allowing a high rate (>= 30%) of poisoning samples. Conversely, an attacker manipulating the training labels (dirty-label attack) can implant into both models an effective backdoor that activates with a superficial, static trigger. The experimental evaluation of the latent trigger attack instead shows that the adversary's knowledge of the target classifier may influence the success of the attack. Assuming perfect knowledge, an attacker can implant a backdoor that activates in 100% of the cases with a poisoning rate as low as 0.1% of the whole updating dataset (namely, 32 poisoning samples in a dataset of 32,000 elements). Lastly, we experiment with two known defensive techniques that were shown to be effective against other backdooring attacks in the malware domain; however, neither proved reliable in detecting the backdoor or the triggered samples created by our latent space attack. We then discuss some modifications to those techniques that may render them effective against latent backdooring attacks.
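To make the poisoning strategies described in the abstract concrete, the following is a minimal, illustrative sketch of the dirty-label variant: a static byte-sequence trigger is appended to a small fraction of malware samples in the update set and their labels are flipped to benign, so that the retrained detector learns to associate the trigger with the benign class. All names (TRIGGER, implant_trigger, poison_update_set), the poisoning rate default, and the specific trigger bytes are hypothetical assumptions for illustration, not the authors' implementation.

import random
from typing import List, Tuple

# Hypothetical static byte-sequence trigger appended to the raw binary
# (e.g., in unused overlay space); the actual bytes are an assumption.
TRIGGER = bytes.fromhex("deadbeefcafebabe")
MALWARE, BENIGN = 1, 0

def implant_trigger(binary: bytes) -> bytes:
    # Superficial trigger: the same static bytes appear verbatim in every
    # poisoned input, which is what makes this variant easier to spot.
    return binary + TRIGGER

def poison_update_set(samples: List[Tuple[bytes, int]],
                      rate: float = 0.001,
                      seed: int = 0) -> List[Tuple[bytes, int]]:
    # Dirty-label attack: a small fraction of malware samples receive the
    # trigger and have their label flipped to benign before the detector's
    # update (retraining) step.
    rng = random.Random(seed)
    malware_idx = [i for i, (_, y) in enumerate(samples) if y == MALWARE]
    n_poison = max(1, int(rate * len(samples)))
    poisoned = list(samples)
    for i in rng.sample(malware_idx, min(n_poison, len(malware_idx))):
        raw_bytes, _ = poisoned[i]
        poisoned[i] = (implant_trigger(raw_bytes), BENIGN)  # flipped label
    return poisoned

At inference time the attacker would append the same byte sequence to a fresh malware binary and expect the backdoored detector to label it benign. The latent-trigger attack differs in that the injected bytes are chosen per sample so that the model's internal feature representation, rather than the raw byte pattern, matches a fixed trigger configuration, which is why the trigger appears dynamic in the input space.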
Pages: 209-220
Page count: 12