Interpretable adversarial example detection via high-level concept activation vector

Cited by: 0
Authors
Li, Jiaxing [1]
Tan, Yu-an [1]
Liu, Xinyu [1]
Meng, Weizhi [2]
Li, Yuanzhang [3]
Affiliations
[1] Beijing Inst Technol, Sch Cyberspace Sci & Technol, Beijing 100081, Peoples R China
[2] Univ Lancaster, Sch Comp & Commun, Lancaster LA1 4YR, England
[3] Beijing Inst Technol, Sch Comp Sci Technol, Beijing 100081, Peoples R China
Keywords
Deep learning; Adversarial machine learning; Model explainability; Adversarial defense; Concept activation vector
DOI
10.1016/j.cose.2024.104218
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Deep neural networks have achieved remarkable performance on many tasks. However, they are easily fooled by small perturbations added to the input, and for image data such perturbations are usually imperceptible to humans. The uninterpretable nature of deep learning systems is considered one of the reasons they are vulnerable to adversarial attacks. As artificial intelligence systems gain wider acceptance among the general public, transparent, reliable, and human-comprehensible decision-making becomes crucial for trust and confidence. In this paper, we propose an approach for defending against adversarial attacks based on concept-level interpretability techniques. Our approach interprets the model in terms of high-level concepts rather than low-level pixel features. Our key finding is that adding small perturbations leads to large changes in the model's concept vector test results. Based on this, we design a single-image concept vector testing method for detecting adversarial examples. Our experiments on the ImageNet dataset show that our method achieves an average accuracy of over 95%. We provide source code in the supplementary material.
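The detection idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the mean-difference construction of the concept vector (a simplification of training a linear classifier on activations), the clean reference profile, and the threshold are all assumptions made for illustration, and the activations and gradients are toy values rather than outputs of a real network.

```python
import numpy as np

def concept_activation_vector(concept_acts, random_acts):
    """Unit vector pointing from random-image activations toward concept activations.

    Assumption: uses the difference of mean activations instead of a trained
    linear classifier, purely to keep the sketch short.
    """
    direction = concept_acts.mean(axis=0) - random_acts.mean(axis=0)
    return direction / np.linalg.norm(direction)

def concept_scores(logit_grad, cavs):
    """Directional derivative of the class logit along each concept vector,
    computed for a single image from the gradient at the chosen layer."""
    return cavs @ logit_grad

def is_adversarial(scores, reference_scores, threshold):
    """Flag an input whose concept scores deviate strongly from a clean
    reference profile (hypothetical decision rule)."""
    return float(np.linalg.norm(scores - reference_scores)) > threshold

# Toy layer activations for two hypothetical concepts in a 2-D feature space.
random_acts = np.zeros((2, 2))
cav_a = concept_activation_vector(np.array([[2.0, 0.0], [4.0, 0.0]]), random_acts)
cav_b = concept_activation_vector(np.array([[0.0, 1.0], [0.0, 3.0]]), random_acts)
cavs = np.stack([cav_a, cav_b])

# Hypothetical clean reference profile for the predicted class, and threshold.
reference = np.array([0.5, 0.2])
threshold = 1.0

# Toy logit gradients for a clean input and a perturbed input.
clean_scores = concept_scores(np.array([0.6, 0.1]), cavs)
adv_scores = concept_scores(np.array([-1.0, 2.0]), cavs)

clean_flag = is_adversarial(clean_scores, reference, threshold)
adv_flag = is_adversarial(adv_scores, reference, threshold)
```

In this toy setup the perturbed input shifts the concept scores far from the reference profile while the clean input stays close to it, which mirrors the abstract's claim that small perturbations cause large changes in concept vector tests.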
Pages: 11