Transferability of features for neural networks links to adversarial attacks and defences

Times Cited: 2
Authors
Kotyan, Shashank [1 ]
Matsuki, Moe [2 ]
Vargas, Danilo Vasconcellos [1 ,3 ]
Affiliations
[1] Kyushu Univ, Dept Informat Sci & Engn, Fukuoka, Japan
[2] SoftBank Grp Corp, Tokyo, Japan
[3] Univ Tokyo, Sch Engn, Dept Elect Engn & Informat Syst, Tokyo, Japan
Source
PLOS ONE | 2022, Vol. 17, Issue 4
DOI
10.1371/journal.pone.0266060
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline Codes
07; 0710; 09;
Abstract
The reason for the existence of adversarial samples is still barely understood. Here, we explore the transferability of learned features to Out-of-Distribution (OoD) classes. We do this by assessing neural networks' capability to encode the existing features, revealing an intriguing connection with adversarial attacks and defences. The principal idea is that, if an algorithm learns rich features, such features should represent Out-of-Distribution classes as a combination of previously learned In-Distribution (ID) classes. This is because OoD classes usually share several regular features with ID classes, provided that the learned features are general enough. We further introduce two metrics to assess how well the transferred features represent OoD classes. One is based on inter-cluster validation techniques, while the other captures the influence of a class over learned features. Experiments suggest that several adversarial defences both reduce the success rate of some attacks and improve the transferability-of-features as measured by our metrics. Experiments also reveal a relationship between the proposed metrics and adversarial attacks (a high Pearson correlation coefficient and low p-value). Further, statistical tests suggest that several adversarial defences, in general, significantly improve transferability. Our tests suggest that models with higher transferability-of-features generally exhibit higher robustness against adversarial attacks. Thus, the experiments suggest that the objectives of adversarial machine learning might be much closer to those of domain transfer learning than previously thought.
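The abstract's methodology can be illustrated with a small sketch: score how well an OoD feature cluster separates from ID clusters (a simple stand-in for the paper's inter-cluster validation metric, whose exact definition is not given in the abstract), then correlate such a transferability score with robust accuracy across models using Pearson's r. All data, the `separation_score` function, and the numeric values below are hypothetical illustrations, not the authors' actual metric or results.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

# Hypothetical penultimate-layer features: two ID classes and one OoD class.
id_a = rng.normal(loc=0.0, scale=0.3, size=(50, 8))
id_b = rng.normal(loc=3.0, scale=0.3, size=(50, 8))
ood = rng.normal(loc=1.5, scale=0.3, size=(50, 8))  # lies between the ID clusters

def mean_pairwise_dist(x, y):
    """Mean Euclidean distance over all cross pairs from x and y."""
    diffs = x[:, None, :] - y[None, :, :]
    return np.linalg.norm(diffs, axis=-1).mean()

def separation_score(cluster, others):
    """Silhouette-style score: higher when `cluster` is compact
    relative to its distance from the nearest other cluster."""
    intra = mean_pairwise_dist(cluster, cluster)
    inter = min(mean_pairwise_dist(cluster, o) for o in others)
    return (inter - intra) / max(inter, intra)

score = separation_score(ood, [id_a, id_b])
print(f"OoD separation score: {score:.3f}")

# Toy correlation between a transferability metric and robust accuracy
# across five hypothetical models (values are illustrative only).
transferability = np.array([0.41, 0.55, 0.63, 0.70, 0.82])
robust_accuracy = np.array([0.22, 0.30, 0.35, 0.41, 0.52])
r, p = pearsonr(transferability, robust_accuracy)
print(f"Pearson r = {r:.3f}, p = {p:.4f}")
```

A high r with a low p-value on such paired measurements is the kind of evidence the abstract refers to when linking the proposed transferability metrics to adversarial robustness.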
Pages: 19