Improved OOD Generalization via Adversarial Training and Pre-training

Cited by: 0
Authors
Yi, Mingyang [1,2]
Hou, Lu [3 ]
Sun, Jiacheng [3 ]
Shang, Lifeng [3 ]
Jiang, Xin [3 ]
Liu, Qun [3 ]
Ma, Zhi-Ming [1,2]
Affiliations
[1] Univ Chinese Acad Sci, Beijing, Peoples R China
[2] Chinese Acad Sci, Acad Math & Syst Sci, Beijing, Peoples R China
[3] Huawei Noah's Ark Lab, Shenzhen, Peoples R China
Keywords
DOI
Not available
CLC number
TP18 [Theory of Artificial Intelligence];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, learning a model that generalizes well on out-of-distribution (OOD) data has attracted great attention in the machine learning community. In this paper, after defining OOD generalization via Wasserstein distance, we theoretically show that a model robust to input perturbation generalizes well on OOD data. Inspired by previous findings that adversarial training helps improve input-robustness, we theoretically show that the excess risk of adversarially trained models on OOD data converges, and empirically verify this on both image classification and natural language understanding tasks. Moreover, in the paradigm of first pre-training and then fine-tuning, we theoretically show that a pre-trained model that is more robust to input perturbation provides a better initialization for generalization on downstream OOD data. Empirically, after fine-tuning, this better-initialized model from adversarial pre-training also achieves better OOD generalization.
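The abstract defines OOD generalization via Wasserstein distance. As a minimal sketch consistent with that setup (the radius rho, the loss ell, and the notation below are illustrative assumptions, not quoted from the paper), the OOD risk of a model f can be written as the worst-case risk over a Wasserstein ball around the training distribution D:

```latex
% Illustrative formalization (assumed, not the paper's exact statement):
% worst-case risk of model f over all test distributions D' within
% Wasserstein distance rho of the training distribution D.
\[
  \mathcal{R}_{\mathrm{OOD}}(f)
  \;=\;
  \sup_{D' \,:\, W_1(D, D') \,\le\, \rho}\;
  \mathbb{E}_{(x,y) \sim D'}\!\left[ \ell\bigl(f(x), y\bigr) \right]
\]
```

Under this reading, input-robustness helps because a distribution within a small Wasserstein distance of D can be reached by small perturbations of inputs drawn from D, so a model insensitive to such perturbations keeps its risk controlled over the whole ball.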
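Since the abstract's argument runs through input-robustness obtained via adversarial training, the following is a minimal PyTorch sketch of PGD-style adversarial training. It is a generic illustration rather than the authors' implementation; the names pgd_attack, epsilon, alpha, and steps are chosen for this example.

```python
# Minimal PGD adversarial-training sketch (illustrative only; not the
# authors' code). Assumes a standard PyTorch classifier and data loader.
import torch
import torch.nn.functional as F


def pgd_attack(model, x, y, epsilon=8 / 255, alpha=2 / 255, steps=7):
    """Search for an L-infinity perturbation of radius `epsilon` that
    (approximately) maximizes the loss, via projected gradient ascent."""
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()  # ascend the loss
            delta.clamp_(-epsilon, epsilon)     # project back into the ball
        delta.grad.zero_()
    # For image inputs one would typically also clamp x + delta to [0, 1].
    return (x + delta).detach()


def adversarial_training_epoch(model, loader, optimizer):
    """One epoch of training on adversarial examples instead of clean inputs."""
    model.train()
    for x, y in loader:
        x_adv = pgd_attack(model, x, y)
        optimizer.zero_grad()  # clear gradients accumulated during the attack
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()
```

Training on x_adv rather than the clean x is what promotes the input-robustness that, per the abstract's theory, translates into a convergent excess risk on OOD data.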
Pages: 11