Sample Efficiency of Data Augmentation Consistency Regularization

Cited by: 0
Authors
Yang, Shuo [1 ]
Dong, Yijun [1 ]
Ward, Rachel [1 ]
Dhillon, Inderjit S. [1 ]
Sanghavi, Sujay [1 ]
Lei, Qi [2 ]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
[2] NYU, New York, NY 10003 USA
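Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes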
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data augmentation is popular in the training of large neural networks; however, currently, theoretical understanding of the discrepancy between different algorithmic choices of leveraging augmented data remains limited. In this paper, we take a step in this direction: we first present a simple and novel analysis for linear regression with label-invariant augmentations, demonstrating that data augmentation consistency (DAC) is intrinsically more efficient than empirical risk minimization on augmented data (DA-ERM). The analysis is then generalized to misspecified augmentations (i.e., augmentations that change the labels), which again demonstrates the merit of DAC over DA-ERM. Further, we extend our analysis to non-linear models (e.g., neural networks) and present generalization bounds. Finally, we perform experiments that make a clean and apples-to-apples comparison (i.e., with no extra modeling or data tweaks) between DAC and DA-ERM using CIFAR-100 and WideResNet; these together demonstrate the superior efficacy of DAC.
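The two estimators compared in the abstract can be illustrated on a toy linear regression problem. The sketch below is not taken from the paper: the data model, the augmentation (noise added only to coordinates the true parameter ignores, so labels are unchanged), the penalty form of DAC, the weight `lam`, and all function names are assumptions chosen for illustration. DA-ERM fits least squares on the original and augmented samples stacked together; DAC fits the original samples while penalizing differences between predictions on each sample and its augmented copy.

```python
import numpy as np

def make_data(n=40, d=20, k=5, noise=0.5, seed=0):
    """Hypothetical toy problem: theta_star uses only the first k coordinates,
    and the augmentation perturbs only the remaining d - k coordinates,
    so the noiseless label of every sample is unchanged (label invariance)."""
    rng = np.random.default_rng(seed)
    theta_star = np.zeros(d)
    theta_star[:k] = rng.normal(size=k)
    X = rng.normal(size=(n, d))
    y = X @ theta_star + noise * rng.normal(size=n)
    X_aug = X.copy()
    X_aug[:, k:] += rng.normal(size=(n, d - k))
    return X, y, X_aug, theta_star

def fit_da_erm(X, y, X_aug):
    """DA-ERM: ordinary least squares on original plus augmented samples,
    with each augmented sample reusing the label of its original."""
    X_all = np.vstack([X, X_aug])
    y_all = np.concatenate([y, y])
    theta, *_ = np.linalg.lstsq(X_all, y_all, rcond=None)
    return theta

def fit_dac(X, y, X_aug, lam=1e3):
    """DAC (assumed penalty form): minimize
    ||X theta - y||^2 + lam * ||(X_aug - X) theta||^2,
    solved in closed form through the normal equations."""
    D = X_aug - X
    gram = X.T @ X + lam * (D.T @ D)
    return np.linalg.solve(gram, X.T @ y)

X, y, X_aug, theta_star = make_data()
for name, theta in [("DA-ERM", fit_da_erm(X, y, X_aug)),
                    ("DAC", fit_dac(X, y, X_aug))]:
    err = float(np.sum((theta - theta_star) ** 2))
    print(name, "squared estimation error:", err)
```

Varying `n`, `noise`, and `lam` in this toy setup gives a rough feel for how the two estimators behave; it is an illustration only and does not reproduce the paper's analysis or experiments.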
Pages: 29
Related papers
50 in total
  • [1] Learning Augmentation for GNNs With Consistency Regularization
    Park, Hyeonjin
    Lee, Seunghun
    Hwang, Dasol
    Jeong, Jisu
    Kim, Kyung-Min
    Ha, Jung-Woo
    Kim, Hyunwoo J.
    IEEE ACCESS, 2021, 9 : 127961 - 127972
  • [2] AugMixSpeech: A Data Augmentation Method and Consistency Regularization for Mandarin Automatic Speech Recognition
    Jiang, Yang
    Chen, Jun
    Han, Kai
    Liu, Yi
    Ma, Siqi
    Song, Yuqing
    Liu, Zhe
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, PT III, NLPCC 2024, 2025, 15361 : 145 - 157
  • [3] A study on the performance improvement of learning based on consistency regularization and unlabeled data augmentation
    Kim, Hyunwoong
    Seok, Kyungha
    KOREAN JOURNAL OF APPLIED STATISTICS, 2021, 34 (02) : 167 - 175
  • [4] Augmentation, Mixing, and Consistency Regularization for Domain Generalization
    Mehmood, Noaman
    Barner, Kenneth
    2024 IEEE 3RD INTERNATIONAL CONFERENCE ON COMPUTING AND MACHINE INTELLIGENCE, ICMI 2024, 2024,
  • [5] Augmentation-induced Consistency Regularization for Classification
    Wu, Jianhan
    Si, Shijing
    Wang, Jianzong
    Xiao, Jing
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [6] Real Sample Consistency Regularization for GANs
    Zhang, Xiangde
    Zhang, Jian
    ENTROPY, 2021, 23 (09)
  • [7] Using Data Augmentation and Consistency Regularization to Improve Semi-supervised Speech Recognition
    Sapru, Ashtosh
    INTERSPEECH 2022, 2022, : 5115 - 5119
  • [8] Anti-adversarial Consistency Regularization for Data Augmentation: Applications to Robust Medical Image Segmentation
    Cho, Hyuna
    Han, Yubin
    Kim, Won Hwa
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT IV, 2023, 14223 : 555 - 566
  • [9] Selecting the regularization parameters in high-dimensional panel data models: Consistency and efficiency
    Ando, Tomohiro
    Bai, Jushan
    ECONOMETRIC REVIEWS, 2018, 37 (03) : 183 - 211
  • [10] Unsupervised Data Augmentation for Consistency Training
    Xie, Qizhe
    Dai, Zihang
    Hovy, Eduard
    Luong, Minh-Thang
    Le, Quoc V.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33