SoK: Explainable Machine Learning in Adversarial Environments

Cited by: 1
Authors
Noppel, Maximilian [1 ]
Wressnegger, Christian [1 ]
Affiliations
[1] Karlsruhe Inst Technol, KASTEL Secur Res Labs, Karlsruhe, Germany
Keywords
Explainable Machine Learning; XAI; Attacks; Defenses; Robustness Notions; Classification; Explanations; Decisions
DOI
10.1109/SP54263.2024.00021
CLC classification
TP [Automation Technology, Computer Technology]
Subject classification
0812
Abstract
Modern deep learning methods have long been considered black boxes due to the lack of insights into their decision-making process. However, recent advances in explainable machine learning have turned the tables. Post-hoc explanation methods enable precise relevance attribution of input features for otherwise opaque models such as deep neural networks. This progression has raised expectations that these techniques can uncover attacks against learning-based systems such as adversarial examples or neural backdoors. Unfortunately, current methods are not robust against manipulations themselves. In this paper, we set out to systematize attacks against post-hoc explanation methods to lay the groundwork for developing more robust explainable machine learning. If explanation methods cannot be misled by an adversary, they can serve as an effective tool against attacks, marking a turning point in adversarial machine learning. We present a hierarchy of explanation-aware robustness notions and relate existing defenses to it. In doing so, we uncover synergies, research gaps, and future directions toward more reliable explanations robust against manipulations.
Pages: 2441 - 2459
Page count: 19
Related papers
50 total
  • [31] Explainable machine learning models with privacy
    Aso Bozorgpanah
    Vicenç Torra
    Progress in Artificial Intelligence, 2024, 13 : 31 - 50
  • [32] Hardware Acceleration of Explainable Machine Learning
    Pan, Zhixin
    Mishra, Prabhat
    PROCEEDINGS OF THE 2022 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2022), 2022, : 1127 - 1130
  • [33] eXplainable Cooperative Machine Learning with NOVA
    Baur, Tobias
    Heimerl, Alexander
    Lingenfelser, Florian
    Wagner, Johannes
    Valstar, Michel F.
    Schuller, Björn
    André, Elisabeth
    KI - Künstliche Intelligenz, 2020, 34 (02): : 143 - 164
  • [34] Explainable Machine Learning for Intrusion Detection
    Bellegdi, Sameh
    Selamat, Ali
    Olatunji, Sunday O.
    Fujita, Hamido
    Krejcar, Ondřej
    ADVANCES AND TRENDS IN ARTIFICIAL INTELLIGENCE: THEORY AND APPLICATIONS, IEA-AIE 2024, 2024, 14748 : 122 - 134
  • [35] Explainable Artificial Intelligence and Machine Learning
    Raunak, M. S.
    Kuhn, Rick
    COMPUTER, 2021, 54 (10) : 25 - 27
  • [36] Explainable machine learning in cybersecurity: A survey
    Yan, Feixue
    Wen, Sheng
    Nepal, Surya
    Paris, Cecile
    Xiang, Yang
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2022, 37 (12) : 12305 - 12334
  • [37] Machine Learning in Adversarial Settings
    McDaniel, Patrick
    Papernot, Nicolas
    Celik, Z. Berkay
    IEEE SECURITY & PRIVACY, 2016, 14 (03) : 68 - 72
  • [38] Quantum adversarial machine learning
    Lu, Sirui
    Duan, Lu-Ming
    Deng, Dong-Ling
    PHYSICAL REVIEW RESEARCH, 2020, 2 (03):
  • [39] Adversarial Machine Learning for Text
    Lee, Daniel
    Verma, Rakesh
    PROCEEDINGS OF THE SIXTH INTERNATIONAL WORKSHOP ON SECURITY AND PRIVACY ANALYTICS (IWSPA'20), 2020, : 33 - 34
  • [40] On the Economics of Adversarial Machine Learning
    Merkle, Florian
    Samsinger, Maximilian
    Schöttle, Pascal
    Pevny, Tomas
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2024, 19 : 4670 - 4685