Data-Efficient Double-Win Lottery Tickets from Robust Pre-training

Cited by: 0
Authors
Chen, Tianlong [1 ]
Zhang, Zhenyu [1 ]
Liu, Sijia [2 ,3 ]
Zhang, Yang [3 ]
Chang, Shiyu [4 ]
Wang, Zhangyang [1 ]
Affiliations
[1] Univ Texas Austin, Dept Elect & Comp Engn, Austin, TX 78712 USA
[2] Michigan State Univ, E Lansing, MI 48824 USA
[3] MIT, IBM Watson AI Lab, Cambridge, MA 02139 USA
[4] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
Keywords: (none listed)
DOI: Not available
Chinese Library Classification (CLC): TP18 [Theory of Artificial Intelligence]
Discipline codes: 081104; 0812; 0835; 1405
Abstract
Pre-training serves as a broadly adopted starting point for transfer learning on various downstream tasks. Recent investigations of the lottery ticket hypothesis (LTH) demonstrate that such enormous pre-trained models can be replaced by extremely sparse subnetworks (a.k.a. matching subnetworks) without sacrificing transferability. However, practical security-critical applications usually pose more challenging requirements beyond standard transfer, demanding that these subnetworks also overcome adversarial vulnerability. In this paper, we formulate a more rigorous concept, Double-Win Lottery Tickets, in which a subnetwork located in a pre-trained model can be independently transferred to diverse downstream tasks and reach BOTH the same standard and robust generalization, under BOTH standard and adversarial training regimes, as the full pre-trained model. We comprehensively examine various pre-training mechanisms and find that robust pre-training tends to craft sparser double-win lottery tickets with superior performance over their standard counterparts. For example, on the downstream CIFAR-10/100 datasets, we identify double-win matching subnetworks with standard, fast adversarial, and adversarial pre-training from ImageNet at 89.26%/73.79%, 89.26%/79.03%, and 91.41%/83.22% sparsity, respectively. Furthermore, we observe that the obtained double-win lottery tickets can be transferred more data-efficiently under practical data-limited (e.g., 1% and 10%) downstream schemes. Our results show that the benefits of robust pre-training are amplified by the lottery ticket scheme as well as by the data-limited transfer setting. Code is available at https://github.com/VITA-Group/Double-Win-LTH.
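In the LTH literature referenced by the abstract, such subnetworks are typically located by iterative magnitude pruning (IMP): train, remove the smallest-magnitude weights, rewind the survivors to the pre-trained values, and repeat; the resulting sparse subnetwork is then checked under both standard and adversarial downstream training. The following is a minimal PyTorch sketch of that generic IMP loop, not the authors' released implementation; the training callback, pruning ratio, and round count are illustrative assumptions.

# Minimal sketch of lottery-ticket-style iterative magnitude pruning (IMP)
# starting from a pre-trained checkpoint. Illustrative only: the model,
# the `train_one_round` callback (standard or adversarial training on the
# downstream task), `prune_rate`, and `n_rounds` are assumptions, not the
# paper's released code.
import copy
import torch

def make_masks(model):
    # One binary mask per weight matrix/tensor; start fully dense.
    # Biases and BatchNorm parameters (dim <= 1) are left unpruned.
    return {n: torch.ones_like(p) for n, p in model.named_parameters() if p.dim() > 1}

def prune_by_magnitude(model, masks, prune_rate):
    # Globally remove the smallest `prune_rate` fraction of surviving weights.
    surviving = torch.cat([p.detach().abs()[masks[n].bool()]
                           for n, p in model.named_parameters() if n in masks])
    k = int(prune_rate * surviving.numel())
    if k == 0:
        return masks
    threshold = torch.kthvalue(surviving, k).values
    return {n: masks[n] * (p.detach().abs() > threshold).float()
            for n, p in model.named_parameters() if n in masks}

def apply_masks(model, masks):
    # Zero out pruned weights in place.
    with torch.no_grad():
        for n, p in model.named_parameters():
            if n in masks:
                p.mul_(masks[n])

def find_ticket(pretrained_model, train_one_round, prune_rate=0.2, n_rounds=10):
    # Rewind point: the pre-trained weights themselves.
    init_state = copy.deepcopy(pretrained_model.state_dict())
    model = copy.deepcopy(pretrained_model)
    masks = make_masks(model)
    for _ in range(n_rounds):
        train_one_round(model, masks)               # user-supplied training step(s)
        masks = prune_by_magnitude(model, masks, prune_rate)
        model.load_state_dict(init_state)           # rewind to pre-trained values
        apply_masks(model, masks)                    # keep only surviving weights
    return model, masks                              # candidate "ticket" subnetwork

Evaluating whether a ticket is "double-win" would then amount to retraining this masked subnetwork on each downstream task twice, once with standard training and once with adversarial training, and comparing its clean and robust accuracy against the dense pre-trained model.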
Pages: 13