Improved OOD Generalization via Adversarial Training and Pre-training

Cited by: 0
Authors
Yi, Mingyang [1,2]
Hou, Lu [3 ]
Sun, Jiacheng [3 ]
Shang, Lifeng [3 ]
Jiang, Xin [3 ]
Liu, Qun [3 ]
Ma, Zhi-Ming [1,2]
Affiliations
[1] Univ Chinese Acad Sci, Beijing, Peoples R China
[2] Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China
[3] Huawei Noah's Ark Lab, Shenzhen, Peoples R China
Keywords
DOI
Not available
CLC Classification Number
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Recently, learning a model that generalizes well on out-of-distribution (OOD) data has attracted great attention in the machine learning community. In this paper, after defining OOD generalization via Wasserstein distance, we theoretically show that a model robust to input perturbation generalizes well on OOD data. Inspired by previous findings that adversarial training helps improve input-robustness, we theoretically show that adversarially trained models have converged excess risk on OOD data, and empirically verify it on both image classification and natural language understanding tasks. Besides, in the paradigm of first pre-training and then fine-tuning, we theoretically show that a pre-trained model that is more robust to input perturbation provides a better initialization for generalization on downstream OOD data. Empirically, after fine-tuning, this better-initialized model from adversarial pre-training also has better OOD generalization.
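The paper's central claim, that a model robust to input perturbation also generalizes to OOD distributions within a bounded Wasserstein distance of the training distribution, can be illustrated with a small toy experiment. The sketch below is our own illustration, not the paper's experimental setup: all names, feature scales, the spurious-flip shift, and the PGD hyperparameters are assumptions. A logistic regression sees two features: a noisy "robust" feature whose correlation with the label survives the shift, and a nearly noiseless "spurious" feature whose correlation flips on the OOD split. L_inf PGD adversarial training suppresses reliance on the spurious feature, so the adversarially trained model keeps its accuracy under the shift while the standard model degrades.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n, flip_spurious=False):
    """Two features: a noisy 'robust' one whose label correlation survives
    the shift, and a nearly noiseless 'spurious' one whose correlation
    flips on the OOD split."""
    y = rng.integers(0, 2, size=n)
    s = 2.0 * y - 1.0
    x_robust = s * 1.0 + rng.normal(0.0, 1.0, size=n)
    sign = -1.0 if flip_spurious else 1.0
    x_spur = sign * s * 0.5 + rng.normal(0.0, 0.02, size=n)
    return np.stack([x_robust, x_spur], axis=1), y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def pgd_attack(w, b, x, y, eps=0.75, steps=5):
    """L_inf PGD on the logistic loss; the input gradient is (p - y) * w."""
    x_adv = x.copy()
    for _ in range(steps):
        p = sigmoid(x_adv @ w + b)
        x_adv = x_adv + (eps / 2.0) * np.sign((p - y)[:, None] * w[None, :])
        x_adv = np.clip(x_adv, x - eps, x + eps)  # stay inside the eps-ball
    return x_adv

def train(x, y, adversarial, lr=0.3, epochs=3000):
    """Full-batch gradient descent on logistic loss; optionally train on
    PGD-perturbed inputs (adversarial training)."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(epochs):
        xt = pgd_attack(w, b, x, y) if adversarial else x
        p = sigmoid(xt @ w + b)
        w -= lr * xt.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

def accuracy(w, b, x, y):
    return np.mean(((x @ w + b) > 0) == (y == 1))

x_tr, y_tr = make_data(2000)
x_ood, y_ood = make_data(2000, flip_spurious=True)  # shifted test set

w_std, b_std = train(x_tr, y_tr, adversarial=False)
w_adv, b_adv = train(x_tr, y_tr, adversarial=True)

acc_std_ood = accuracy(w_std, b_std, x_ood, y_ood)
acc_adv_ood = accuracy(w_adv, b_adv, x_ood, y_ood)
print(f"standard    OOD accuracy: {acc_std_ood:.3f}")
print(f"adversarial OOD accuracy: {acc_adv_ood:.3f}")
```

Because the attack can flip the small-margin spurious feature inside its eps-ball, adversarial training drives its weight toward zero, which is a crude finite-dimensional analogue of the paper's point: robustness inside a perturbation ball transfers to distributions within that Wasserstein radius.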
Pages: 11
Related Papers
50 items in total
  • [41] Unsupervised Point Cloud Pre-training via Occlusion Completion
    Wang, Hanchen
    Liu, Qi
    Yue, Xiangyu
    Lasenby, Joan
    Kusner, Matt J.
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 9762 - 9772
  • [42] Understanding the Effects of Pre-Training for Object Detectors via Eigenspectrum
    Shinya, Yosuke
    Simo-Serra, Edgar
    Suzuki, Taiji
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 1931 - 1941
  • [43] UNSUPERVISED POINT CLOUD PRE-TRAINING VIA CONTRASTING AND CLUSTERING
    Mei, Guofeng
    Huang, Xiaoshui
    Liu, Juan
    Zhang, Jian
    Wu, Qiang
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 66 - 70
  • [44] Ponder: Point Cloud Pre-training via Neural Rendering
    Huang, Di
    Peng, Sida
    He, Tong
    Yang, Honghui
    Zhou, Xiaowei
    Ouyang, Wanli
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023, : 16043 - 16052
  • [45] Improving Knowledge Tracing via Pre-training Question Embeddings
    Liu, Yunfei
    Yang, Yang
    Chen, Xianyu
    Shen, Jian
    Zhang, Haifeng
    Yu, Yong
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 1556 - 1562
  • [46] Improved Text Classification via Contrastive Adversarial Training
    Pan, Lin
    Hang, Chung-Wei
    Sil, Avi
    Potdar, Saloni
    [J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 11130 - 11138
  • [47] On Generalization of Graph Autoencoders with Adversarial Training
    Huang, Tianjin
    Pei, Yulong
    Menkovski, Vlado
    Pechenizkiy, Mykola
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: RESEARCH TRACK, PT II, 2021, 12976 : 367 - 382
  • [48] Dialogue Specific Pre-training Tasks for Improved Dialogue State Tracking
    An, Jinwon
    Kim, Misuk
    [J]. NEURAL PROCESSING LETTERS, 2023, 55 (06) : 7761 - 7776
  • [49] Improved Fine-Tuning by Better Leveraging Pre-Training Data
    Liu, Ziquan
    Xu, Yi
    Xu, Yuanhong
    Qian, Qi
    Li, Hao
    Ji, Xiangyang
    Chan, Antoni B.
    Jin, Rong
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,