On the Trade-off of Intra-/Inter-class Diversity for Supervised Pre-training

Cited: 0
Authors
Zhang, Jieyu [1]
Wang, Bohan [2]
Hu, Zhengyu [3]
Koh, Pang Wei [1]
Ratner, Alexander [1,4]
Institutions
[1] Univ Washington, Seattle, WA 98195 USA
[2] USTC, Hefei, Anhui, Peoples R China
[3] HKUST GZ, Guangzhou, Peoples R China
[4] Snorkel AI Inc, Redwood City, CA USA
Keywords
DOI
N/A
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104; 0812; 0835; 1405;
Abstract
Pre-training datasets are critical for building state-of-the-art machine learning models, motivating rigorous study of their impact on downstream tasks. In this work, we study the impact of the trade-off between the intra-class diversity (the number of samples per class) and the inter-class diversity (the number of classes) of a supervised pre-training dataset. Empirically, given a fixed pre-training dataset size, we find that the best downstream performance comes with a balance between intra- and inter-class diversity. To understand the underlying mechanism, we show theoretically that downstream performance depends monotonically on both types of diversity. Notably, our theory reveals that the optimal class-to-sample ratio, i.e., the ratio of the number of pre-training classes to the number of samples per class, is invariant to the size of the pre-training dataset, enabling prediction of the optimal number of pre-training classes. We demonstrate the effectiveness of this prediction, obtaining an improvement of approximately 2 points on average on downstream tasks when pre-training on ImageNet.
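As an illustration of the abstract's central claim, the sketch below (not from the paper) shows how an invariant class-to-sample ratio yields a concrete prediction. With a fixed budget N = C * m (C classes, m samples per class) and a target ratio r* = C / m, simple algebra gives C = sqrt(N * r*) and m = sqrt(N / r*). This is a minimal Python sketch; the function optimal_classes and the value r_star = 0.05 are hypothetical placeholders, since in practice r* would be fitted empirically at a smaller scale.

    import math

    def optimal_classes(n_total, r_star):
        """Solve C * m = n_total and C / m = r_star for the number of
        classes C and the samples per class m:
            C = sqrt(n_total * r_star),  m = sqrt(n_total / r_star).
        """
        c = math.sqrt(n_total * r_star)
        m = math.sqrt(n_total / r_star)
        return round(c), round(m)

    # Hypothetical ratio assumed fitted at a small scale; the abstract's
    # claim is that r* stays fixed as the budget grows, so the same ratio
    # predicts the optimal class count at ImageNet scale (~1.28M images).
    c, m = optimal_classes(1_280_000, r_star=0.05)
    print(f"predicted optimum: {c} classes x {m} samples/class")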
Pages: 20
Related Papers
26 items in total
  • [21] GM Score: Incorporating Inter-Class and Intra-Class Generator Diversity, Discriminability of Latent Space, and Sample Fidelity for Evaluating GANs
    Harshvardhan, G. M.
    Sahu, Aanchal
    Gourisaria, Mahendra Kumar
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2023, 48 (02): 2207-2230
  • [22] Effect of pre-training scale on intra- and inter-domain, full and few-shot transfer learning for natural and X-Ray chest images
    Cherti, Mehdi
    Jitsev, Jenia
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022
  • [23] Intra-modality masked image modeling: A self-supervised pre-training method for brain tumor segmentation
    Qi, Liangce
    Shi, Weili
    Miao, Yu
    Li, Yonghui
    Feng, Guanyuan
    Jiang, Zhengang
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, 95
  • [24] Self-supervised pseudo multi-class pre-training for unsupervised anomaly detection and segmentation in medical images
    Tian, Yu
    Liu, Fengbei
    Pang, Guansong
    Chen, Yuanhong
    Liu, Yuyuan
    Verjans, Johan W.
    Singh, Rajvinder
    Carneiro, Gustavo
    MEDICAL IMAGE ANALYSIS, 2023, 90
  • [25] Improving the classification of veterinary thoracic radiographs through inter-species and inter-pathology self-supervised pre-training of deep learning models
    Celniak, Weronika
    Wodzinski, Marek
    Jurgas, Artur
    Burti, Silvia
    Zotti, Alessandro
    Atzori, Manfredo
    Mueller, Henning
    Banzato, Tommaso
    SCIENTIFIC REPORTS, 2023, 13 (01)