Bayesian Invariant Risk Minimization

Cited by: 16
Authors
Lin, Yong [1]
Dong, Hanze [1]
Wang, Hao [2]
Zhang, Tong [1, 3]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[2] Rutgers State Univ, New Brunswick, NJ USA
[3] Google Res, Mountain View, CA USA
Keywords
INFERENCE;
DOI
10.1109/CVPR52688.2022.01555
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Generalization under distributional shift is an open challenge for machine learning. Invariant Risk Minimization (IRM) is a promising framework for tackling this issue by extracting invariant features. However, despite the potential and popularity of IRM, recent works have reported negative results for it on deep models. We argue that the failure can be primarily attributed to deep models' tendency to overfit the data. Specifically, our theoretical analysis shows that IRM degenerates to empirical risk minimization (ERM) when overfitting occurs. Our empirical evidence supports this: IRM methods that work well in typical settings deteriorate significantly even if we slightly enlarge the model size or reduce the amount of training data. To alleviate this issue, we propose Bayesian Invariant Risk Minimization (BIRM), which introduces Bayesian inference into IRM. The key motivation is to estimate the IRM penalty based on the posterior distribution of classifiers (as opposed to a single classifier), which is much less prone to overfitting. Extensive experimental results on four datasets demonstrate that BIRM consistently and significantly outperforms the existing IRM baselines.
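To make the contrast between a single-classifier penalty and a penalty taken over several classifiers concrete, the following is a minimal sketch and not the paper's method. It assumes a squared loss, a scalar classifier `w` applied to a one-dimensional feature, and the gradient-norm penalty of the IRMv1 formulation (squared gradient of an environment's risk with respect to `w`, evaluated at `w = 1`). The function names, the toy data, and the hand-picked `w_samples` standing in for draws from a posterior are all hypothetical; the exact BIRM objective is defined in the paper and may differ.

```python
from statistics import fmean


def risk_grad(features, targets, w):
    """Derivative w.r.t. w of the mean squared error mean((w*f - y)^2)."""
    return fmean(2.0 * (w * f - y) * f for f, y in zip(features, targets))


def irmv1_penalty(features, targets):
    """IRMv1-style penalty for one environment: squared risk gradient at w = 1."""
    return risk_grad(features, targets, 1.0) ** 2


def averaged_penalty(features, targets, w_samples):
    """Squared risk gradient averaged over several classifiers.

    Illustrative assumption only: w_samples stands in for posterior draws.
    """
    return fmean(risk_grad(features, targets, w) ** 2 for w in w_samples)


# Toy environment in which the feature equals the target, so w = 1 is optimal.
env_features = [1.0, 2.0, 3.0]
env_targets = [1.0, 2.0, 3.0]

point_penalty = irmv1_penalty(env_features, env_targets)
spread_penalty = averaged_penalty(env_features, env_targets, [0.9, 1.0, 1.1])
```

In this toy case the single-classifier penalty vanishes because the data are fit exactly, while the averaged version stays positive because it also scores classifiers near `w = 1`. This is only meant to illustrate why a penalty computed from one fitted classifier can become uninformative when the model fits the training data perfectly.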
Pages: 16000-16009
Number of pages: 10
Related papers
50 in total
  • [1] Sparse Invariant Risk Minimization
    Zhou, Xiao
    Lin, Yong
    Zhang, Weizhong
    Zhang, Tong
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [2] Invariant Risk Minimization Games
    Ahuja, Kartik
    Shanmugam, Karthikeyan
    Varshney, Kush R.
    Dhurandhar, Amit
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [3] Bayesian Counterfactual Risk Minimization
    London, Ben
    Sandler, Ted
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [4] Does Invariant Risk Minimization Capture Invariance?
    Kamath, Pritish
    Tangella, Akilesh
    Sutherland, Danica J.
    Srebro, Nathan
    [J]. 24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [5] Minimization of Attack Risk with Bayesian Detection Criteria
    Standley, Vaughn H.
    Nuno, Frank G.
    Sharpe, Jacob W.
    [J]. PROCEEDINGS OF THE 4TH INTERNATIONAL CONFERENCE ON COMPLEXITY, FUTURE INFORMATION SYSTEMS AND RISK (COMPLEXIS), 2019, : 17 - 26
  • [6] Learning Bayesian network classifiers by risk minimization
    Kelner, Roy
    Lerner, Boaz
    [J]. International Journal of Approximate Reasoning, 2012, 53 (02): : 248 - 272
  • [7] Learning Bayesian network classifiers by risk minimization
    Kelner, Roy
    Lerner, Boaz
    [J]. INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2012, 53 (02) : 248 - 272
  • [8] Improving Deepfake Detection Generalization by Invariant Risk Minimization
    Yin, Zixin
    Wang, Jiakai
    Xiao, Yisong
    Zhao, Hanqing
    Li, Tianlin
    Zhou, Wenbo
    Liu, Aishan
    Liu, Xianglong
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 6785 - 6798
  • [9] TREATMENT EFFECT ESTIMATION USING INVARIANT RISK MINIMIZATION
    Shah, Abhin
    Ahuja, Kartik
    Shanmugam, Karthikeyan
    Wei, Dennis
    Varshney, Kush R.
    Dhurandhar, Amit
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 5005 - 5009
  • [10] Mail filtering based on the risk minimization Bayesian algorithm
    Lin, YP
    Chen, ZP
    Yang, XL
    Shi, XJ
    [J]. 6TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL XVII, PROCEEDINGS: INDUSTRIAL SYSTEMS AND ENGINEERING III, 2002, : 282 - 285