Sample Efficiency of Data Augmentation Consistency Regularization

Cited by: 0
Authors
Yang, Shuo [1 ]
Dong, Yijun [1 ]
Ward, Rachel [1 ]
Dhillon, Inderjit S. [1 ]
Sanghavi, Sujay [1 ]
Lei, Qi [2 ]
Affiliations
[1] Univ Texas Austin, Austin, TX 78712 USA
[2] NYU, New York, NY 10003 USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data augmentation is popular in the training of large neural networks; however, the theoretical understanding of how different algorithmic choices for leveraging augmented data differ remains limited. In this paper, we take a step in this direction: we first present a simple and novel analysis for linear regression with label-invariant augmentations, demonstrating that data augmentation consistency (DAC) is intrinsically more efficient than empirical risk minimization on augmented data (DA-ERM). The analysis is then generalized to misspecified augmentations (i.e., augmentations that change the labels), again demonstrating the merit of DAC over DA-ERM. Further, we extend the analysis to non-linear models (e.g., neural networks) and present generalization bounds. Finally, we perform experiments that make a clean, apples-to-apples comparison (i.e., with no extra modeling or data tweaks) between DAC and DA-ERM using CIFAR-100 and WideResNet; together, these results demonstrate the superior efficacy of DAC.
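The abstract contrasts two ways of using augmented data. The snippet below is a minimal, illustrative sketch (not the authors' code) of that contrast for linear regression with a label-invariant augmentation; the specific augmentation, the penalty weight `lam`, and all variable names are assumptions made for illustration. DA-ERM fits least squares on the stacked original-plus-augmented dataset, while DAC fits on the original labels and adds a penalty forcing predictions on an example and its augmentation to agree.

```python
# Illustrative sketch of DA-ERM vs. DAC for linear regression (assumed setup, not the paper's code).
import numpy as np

rng = np.random.default_rng(0)

d, n = 20, 50
theta_star = rng.normal(size=d)                  # ground-truth regressor
X = rng.normal(size=(n, d))                      # original inputs
y = X @ theta_star + 0.1 * rng.normal(size=n)    # noisy labels

def augment(X):
    """Label-invariant augmentation: perturb only directions orthogonal to theta_star
    (an illustrative stand-in for augmentations that leave the label unchanged)."""
    noise = rng.normal(size=X.shape)
    noise -= np.outer(noise @ theta_star, theta_star) / (theta_star @ theta_star)
    return X + noise

X_aug = augment(X)

# DA-ERM: ordinary least squares on the stacked (original + augmented) dataset.
X_da = np.vstack([X, X_aug])
y_da = np.concatenate([y, y])
theta_da_erm = np.linalg.lstsq(X_da, y_da, rcond=None)[0]

# DAC: least squares on the original data plus a consistency penalty, i.e.
#   min_theta ||X theta - y||^2 + lam * ||(X - X_aug) theta||^2,
# solved here in closed form via the normal equations.
lam = 10.0
D = X - X_aug
theta_dac = np.linalg.solve(X.T @ X + lam * D.T @ D, X.T @ y)

print("DA-ERM estimation error:", np.linalg.norm(theta_da_erm - theta_star))
print("DAC    estimation error:", np.linalg.norm(theta_dac - theta_star))
```

In this toy setting the DAC penalty effectively restricts the fit to directions unaffected by the augmentation, which mirrors the intuition behind the sample-efficiency advantage the abstract describes.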
Pages: 29