Provable Unrestricted Adversarial Training Without Compromise With Generalizability

Cited by: 1
Authors
Zhang, Lilin [1 ]
Yang, Ning [1 ]
Sun, Yanchao [2 ]
Yu, Philip S. [3 ]
Affiliations
[1] Sichuan Univ, Sch Comp Sci, Chengdu 610017, Peoples R China
[2] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
[3] Univ Illinois, Dept Comp Sci, Chicago, IL 60607 USA
Funding
National Natural Science Foundation of China;
Keywords
Robustness; Training; Standards; Perturbation methods; Stars; Optimization; Computer science; Adversarial robustness; adversarial training; unrestricted adversarial examples; standard generalizability;
DOI
10.1109/TPAMI.2024.3400988
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Adversarial training (AT) is widely considered the most promising strategy for defending against adversarial attacks and has drawn increasing interest from researchers. However, existing AT methods still face two challenges. First, they cannot handle unrestricted adversarial examples (UAEs), which are generated from scratch, as opposed to restricted adversarial examples (RAEs), which are created by adding perturbations bounded by an l_p norm to observed examples. Second, existing AT methods often achieve adversarial robustness at the expense of standard generalizability (i.e., accuracy on natural examples) because they trade one off against the other. To overcome these challenges, we propose a unique viewpoint that understands UAEs as imperceptibly perturbed unobserved examples, and we find that the tradeoff results from the separation of the distributions of adversarial examples and natural examples. Based on these ideas, we propose a novel AT approach called Provable Unrestricted Adversarial Training (PUAT), which provides a target classifier with comprehensive adversarial robustness against both UAEs and RAEs while simultaneously improving its standard generalizability. In particular, PUAT leverages partially labeled data to generate effective UAEs by accurately capturing the natural data distribution through a novel augmented triple-GAN. At the same time, PUAT extends traditional AT by introducing the supervised loss of the target classifier into the adversarial loss and, with the collaboration of the augmented triple-GAN, aligns the UAE distribution, the natural data distribution, and the distribution learned by the classifier. Finally, solid theoretical analysis and extensive experiments on widely used benchmarks demonstrate the superiority of PUAT.
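For context on the baseline the abstract contrasts with, the following is a minimal, illustrative sketch of conventional *restricted* adversarial training (the RAE setting): an inner maximization crafts an l_inf-bounded FGSM perturbation, and an outer minimization updates the model on the perturbed inputs. This is generic FGSM-based AT on a toy logistic-regression model, not the PUAT method itself; all function names, data, and hyperparameters here are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, eps):
    # Fast Gradient Sign Method: move each example within an l_inf ball of
    # radius eps in the direction that increases the logistic loss.
    grad_x = (sigmoid(x @ w) - y)[:, None] * w[None, :]
    return x + eps * np.sign(grad_x)

def adversarial_train(x, y, eps=0.1, lr=0.5, steps=200, seed=0):
    # Min-max training: at each step, generate restricted adversarial
    # examples (inner max), then take a gradient step on them (outer min).
    rng = np.random.default_rng(seed)
    w = rng.normal(scale=0.01, size=x.shape[1])
    for _ in range(steps):
        x_adv = fgsm(x, y, w, eps)                          # inner maximization
        grad_w = x_adv.T @ (sigmoid(x_adv @ w) - y) / len(y)
        w -= lr * grad_w                                    # outer minimization
    return w

# Toy two-cluster data standing in for "observed examples".
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])
w = adversarial_train(x, y)
acc = np.mean((sigmoid(x @ w) > 0.5) == y)  # standard (natural) accuracy
```

The key limitation the paper targets is visible in `fgsm`: the adversary can only perturb *observed* examples within an eps-ball, so examples built from scratch (UAEs) lie outside this threat model, and training only on `x_adv` is what can shift the learned distribution away from the natural one.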
Pages: 8302-8319
Page count: 18
Related Papers
50 records
  • [1] Provable Robustness of Adversarial Training for Learning Halfspaces with Noise
    Zou, Difan
    Frei, Spencer
    Gu, Quanquan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [2] Adversarial Training and Provable Robustness: A Tale of Two Objectives
    Fan, Jiameng
    Li, Wenchao
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 7367 - 7376
  • [3] Impact of Adversarial Training on Robustness and Generalizability of Language Models
    Altinisik, Enes
    Sajjad, Hassan
    Sencar, Husrev Taha
    Messaoud, Safa
    Chawla, Sanjay
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023), 2023, : 7828 - 7840
  • [4] Robustra: Training Provable Robust Neural Networks over Reference Adversarial Space
    Li, Linyi
    Zhong, Zexuan
    Li, Bo
    Xie, Tao
    PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 4711 - 4717
  • [5] Scaling provable adversarial defenses
    Wong, Eric
    Schmidt, Frank R.
    Metzen, Jan Hendrik
    Kolter, J. Zico
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [6] Invariant Representations without Adversarial Training
    Moyer, Daniel
    Gao, Shuyang
    Brekelmans, Rob
    Steeg, Greg Ver
    Galstyan, Aram
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [7] On The Generation of Unrestricted Adversarial Examples
    Khoshpasand, Mehrgan
    Ghorbani, Ali
    50TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS WORKSHOPS (DSN-W 2020), 2020, : 9 - 15
  • [8] Training without training data: Improving the generalizability of automated medical abbreviation disambiguation
    Skreta, Marta
    Arbabi, Aryan
    Wang, Jixuan
    Brudno, Michael
    MACHINE LEARNING FOR HEALTH WORKSHOP, VOL 116, 2019, 116 : 233 - 245
  • [9] Provable Adversarial Robustness in the Quantum Model
    Barooti, Khashayar
    Gluch, Grzegorz
    Urbanke, Ruediger
    arXiv, 2021,
  • [10] Efficient Adversarial Defense without Adversarial Training: A Batch Normalization Approach
    Zhu, Yao
    Wei, Xiao
    Zhu, Yue
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,