Overcoming Catastrophic Forgetting for Fine-Tuning Pre-trained GANs

Cited by: 0
Authors
Zhang, Zeren [1 ]
Li, Xingjian [2 ]
Hong, Tao [1 ]
Wang, Tianyang [3 ]
Ma, Jinwen [1 ]
Xiong, Haoyi [2 ]
Xu, Cheng-Zhong [4 ]
Affiliations
[1] Peking Univ, Sch Math Sci, Beijing 100871, Peoples R China
[2] Baidu Inc, Beijing, Peoples R China
[3] Univ Alabama Birmingham, Birmingham, AL 35294 USA
[4] Univ Macau, Macau, Peoples R China
Keywords
Transfer Learning; Generative Adversarial Networks;
DOI
10.1007/978-3-031-43424-2_18
CLC Number (Chinese Library Classification)
TP18 [Artificial Intelligence Theory];
Discipline Classification Code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The strong transferability of DNNs has given rise to the popular "pre-training & fine-tuning" paradigm, by which a data-scarce task can be performed much more easily. However, compared to the extensive efforts devoted to supervised transfer learning, far fewer explorations have been made into effectively fine-tuning pre-trained Generative Adversarial Networks (GANs). As reported in recent empirical studies, fine-tuning GANs faces a challenge of catastrophic forgetting similar to that in supervised transfer learning. This causes a severe capacity loss of the pre-trained model when adapting it to downstream datasets. While most existing approaches directly intervene in parameter updating, this paper introduces novel schemes from another perspective, i.e., inputs and features, and thus essentially focuses on the data aspect. Firstly, we adopt a trust-region method to smooth the adaptation dynamics by progressively adjusting input distributions, aiming to avoid dramatic parameter changes, especially when the pre-trained GAN has no information about the target data. Secondly, we aim to preserve the diversity of the results generated by the fine-tuned GAN. This is achieved by explicitly encouraging generated images to encompass diversified spectral components in their deep features. We theoretically study the rationale of the proposed schemes and conduct extensive experiments on popular transfer learning benchmarks to demonstrate their superiority. The code and corresponding supplemental materials are available at https://github.com/zezeze97/Transfer-Pretrained-Gan.
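To make the second scheme concrete, below is a minimal, hypothetical sketch (assuming PyTorch; the names spectral_diversity_loss, lambda_div, and feature_extractor are illustrative placeholders, not the authors' released code) of one way to encourage diversified spectral components in the deep features of a generated batch: the singular values of the batch feature matrix are normalized and their entropy is maximized, so the features cannot collapse onto a few spectral directions.

# Illustrative sketch only (assumed PyTorch), not the paper's released implementation:
# penalize spectral collapse of a batch of deep features from generated images by
# maximizing the entropy of the normalized singular values of the feature matrix.
import torch

def spectral_diversity_loss(features: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # features: (batch_size, feature_dim) deep features of a batch of generated images.
    centered = features - features.mean(dim=0, keepdim=True)
    # Singular values summarize how the batch spreads across feature directions.
    singular_values = torch.linalg.svdvals(centered)
    p = singular_values / (singular_values.sum() + eps)
    # Low entropy means the features concentrate on a few directions (diversity collapse);
    # returning the negative entropy lets a minimizing trainer push toward diversity.
    entropy = -(p * torch.log(p + eps)).sum()
    return -entropy

# Hypothetical usage inside a fine-tuning step (lambda_div and feature_extractor are assumed names):
# g_loss = adversarial_loss + lambda_div * spectral_diversity_loss(feature_extractor(fake_images))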
Pages: 293-308
Number of pages: 16