Improved OOD Generalization via Adversarial Training and Pre-training

Cited by: 0
Authors
Yi, Mingyang [1,2]
Hou, Lu [3]
Sun, Jiacheng [3]
Shang, Lifeng [3]
Jiang, Xin [3]
Liu, Qun [3]
Ma, Zhi-Ming [1,2]
Affiliations
[1] Univ Chinese Acad Sci, Beijing, Peoples R China
[2] Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China
[3] Huawei Noah's Ark Lab, Shenzhen, Peoples R China
Keywords
DOI
Not available
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, learning a model that generalizes well on out-of-distribution (OOD) data has attracted great attention in the machine learning community. In this paper, after defining OOD generalization via the Wasserstein distance, we theoretically show that a model robust to input perturbations generalizes well on OOD data. Inspired by previous findings that adversarial training helps improve input-robustness, we theoretically show that adversarially trained models have a converged excess risk on OOD data, and we empirically verify this on both image classification and natural language understanding tasks. Moreover, in the paradigm of first pre-training and then fine-tuning, we theoretically show that a pre-trained model that is more robust to input perturbations provides a better initialization for generalization on downstream OOD data. Empirically, after fine-tuning, this better-initialized model from adversarial pre-training also achieves better OOD generalization.
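The abstract ties OOD generalization (measured by the Wasserstein distance between training and test distributions) to robustness against small input perturbations, and points to adversarial training as a way to obtain such robustness. As a rough illustration only, not the authors' exact procedure, the PyTorch-style sketch below performs PGD-based adversarial training; the perturbation budget eps, step size alpha, and step count are placeholder assumptions.

    import torch
    import torch.nn.functional as F

    def pgd_perturb(model, x, y, eps=8/255, alpha=2/255, steps=10):
        # Find a small perturbation inside an L-infinity ball of radius eps
        # that (approximately) maximizes the classification loss.
        delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta), y)
            loss.backward()
            with torch.no_grad():
                delta += alpha * delta.grad.sign()  # gradient ascent on the loss
                delta.clamp_(-eps, eps)             # project back into the eps-ball
            delta.grad.zero_()
        return delta.detach()

    def adversarial_training_step(model, optimizer, x, y):
        # One training step on adversarially perturbed inputs; iterating this
        # over the training set yields the input-robust model that the abstract
        # argues generalizes better on OOD data.
        delta = pgd_perturb(model, x, y)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        optimizer.step()
        return loss.item()

The same loop can be used in the pre-training stage discussed in the abstract; the resulting robust pre-trained weights would then serve as the initialization for standard fine-tuning on downstream tasks.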
Pages: 11
Related Papers
50 records in total
  • [1] Better Representations via Adversarial Training in Pre-Training: A Theoretical Perspective
    Xing, Yue
    Lin, Xiaofeng
    Song, Qifan
    Xu, Yi
    Zeng, Belinda
    Cheng, Guang
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 238, 2024, 238
  • [2] LogiGAN: Learning Logical Reasoning via Adversarial Pre-training
    Pi, Xinyu
    Zhong, Wanjun
    Gao, Yan
    Duan, Nan
    Lou, Jian-Guang
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [3] Poster: Boosting Adversarial Robustness by Adversarial Pre-training
    Xu, Xiaoyun
    Picek, Stjepan
    [J]. PROCEEDINGS OF THE 2023 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, CCS 2023, 2023, : 3540 - 3542
  • [4] Adversarial momentum-contrastive pre-training
    Xu, Cong
    Li, Dan
    Yang, Min
    [J]. PATTERN RECOGNITION LETTERS, 2022, 160 : 172 - 179
  • [5] Pre-training via Paraphrasing
    Lewis, Mike
    Ghazvininejad, Marjan
    Ghosh, Gargi
    Aghajanyan, Armen
    Wang, Sida
    Zettlemoyer, Luke
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [6] Robust Pre-Training by Adversarial Contrastive Learning
    Jiang, Ziyu
    Chen, Tianlong
    Chen, Ting
    Wang, Zhangyang
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
  • [7] A Broad Study of Pre-training for Domain Generalization and Adaptation
    Kim, Donghyun
    Wang, Kaihong
    Sclaroff, Stan
    Saenko, Kate
    [J]. COMPUTER VISION - ECCV 2022, PT XXXIII, 2022, 13693 : 621 - 638
  • [8] Robustness and Generalization via Generative Adversarial Training
    Poursaeed, Omid
    Jiang, Tianxing
    Yang, Harry
    Belongie, Serge
    Lim, Ser-Nam
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 15691 - 15700
  • [9] Bridge Pre-Training and Clustering: A Unified Contrastive Learning Framework for OOD Intent Discovery
    Mou, Yutao
    Xu, Heyang
    [J]. IEEE ACCESS, 2023, 11 : 63714 - 63724
  • [10] Improved generalization via tolerant training
    Street, WN
    Mangasarian, OL
    [J]. JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 1998, 96 (02) : 259 - 279