Sample Efficiency of Data Augmentation Consistency Regularization

Cited by: 0
Authors
Yang, Shuo [1]
Dong, Yijun [1]
Ward, Rachel [1]
Dhillon, Inderjit S. [1]
Sanghavi, Sujay [1]
Lei, Qi [2]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
[2] NYU, New York, NY 10003 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Data augmentation is popular in the training of large neural networks; however, theoretical understanding of the discrepancy between different algorithmic choices for leveraging augmented data remains limited. In this paper, we take a step in this direction: we first present a simple and novel analysis for linear regression with label-invariant augmentations, demonstrating that data augmentation consistency (DAC) is intrinsically more efficient than empirical risk minimization on augmented data (DA-ERM). The analysis is then generalized to misspecified augmentations (i.e., augmentations that change the labels), which again demonstrates the merit of DAC over DA-ERM. Further, we extend our analysis to non-linear models (e.g., neural networks) and present generalization bounds. Finally, we perform experiments that make a clean and apples-to-apples comparison (i.e., with no extra modeling or data tweaks) between DAC and DA-ERM using CIFAR-100 and WideResNet; these together demonstrate the superior efficacy of DAC.
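The contrast between DA-ERM and DAC in the linear regression setting can be illustrated with a minimal sketch. It is an assumption-laden illustration, not the paper's implementation: the function names `da_erm` and `dac`, the soft penalty form, and the penalty weight `lam` are all hypothetical choices made here. DA-ERM fits least squares on the original and augmented samples with the original labels copied over, while DAC fits least squares on the original samples only and penalizes any difference between predictions on a sample and on its augmentation.

```python
import numpy as np


def da_erm(X, y, X_aug):
    """DA-ERM sketch: least squares on original plus augmented samples.

    Each augmented sample reuses the label of the sample it came from.
    """
    X_all = np.vstack([X, X_aug])
    y_all = np.concatenate([y, y])
    theta, *_ = np.linalg.lstsq(X_all, y_all, rcond=None)
    return theta


def dac(X, y, X_aug, lam=1e3):
    """DAC sketch (soft-penalty form, an assumption made for illustration).

    Minimizes ||X @ theta - y||^2 + lam * ||(X_aug - X) @ theta||^2,
    solved as a single stacked least-squares problem.
    """
    D = X_aug - X  # prediction differences should vanish along these directions
    A = np.vstack([X, np.sqrt(lam) * D])
    b = np.concatenate([y, np.zeros(D.shape[0])])
    theta, *_ = np.linalg.lstsq(A, b, rcond=None)
    return theta


# Toy data: the label depends only on the first feature, and the
# augmentation perturbs only the second, so it is label invariant.
theta_star = np.array([2.0, 0.0])
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_aug = X + np.array([[0.0, 0.5], [0.0, -0.3], [0.0, 0.2]])
y = X @ theta_star

theta_erm = da_erm(X, y, X_aug)
theta_dac = dac(X, y, X_aug)
```

On this noise-free toy problem both estimators recover `theta_star`; the sample-efficiency gap the abstract describes concerns noisy labels and limited samples, which this sketch does not attempt to reproduce.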
Pages: 29