Limitations of Information-Theoretic Generalization Bounds for Gradient Descent Methods in Stochastic Convex Optimization

Cited by: 0
Authors
Haghifam, Mahdi [1 ,2 ]
Rodriguez-Galvez, Borja [3 ]
Thobaben, Ragnar [3 ]
Skoglund, Mikael [3 ]
Roy, Daniel M. [1 ,2 ]
Dziugaite, Gintare Karolina [4 ,5 ,6 ]
Affiliations
[1] Univ Toronto, Toronto, ON, Canada
[2] Vector Inst, Toronto, ON, Canada
[3] KTH Royal Inst Technol, Stockholm, Sweden
[4] Google Res, Toronto, ON, Canada
[5] Mila, Montreal, PQ, Canada
[6] McGill Univ, Montreal, PQ, Canada
Funding
Swedish Research Council; Natural Sciences and Engineering Research Council of Canada
Keywords
STABILITY;
DOI
Not available
Chinese Library Classification
TP [Automation technology; computer technology]
Discipline Classification Code
0812
Abstract
To date, no "information-theoretic" frameworks for reasoning about generalization error have been shown to establish minimax rates for gradient descent in the setting of stochastic convex optimization. In this work, we consider the prospect of establishing such rates via several existing information-theoretic frameworks: input-output mutual information bounds, conditional mutual information bounds and variants, PAC-Bayes bounds, and recent conditional variants thereof. We prove that none of these bounds are able to establish minimax rates. We then consider a common tactic employed in studying gradient methods, whereby the final iterate is corrupted by Gaussian noise, producing a noisy "surrogate" algorithm. We prove that minimax rates cannot be established via the analysis of such surrogates. Our results suggest that new ideas are required to analyze gradient descent using information-theoretic techniques.
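For context, the first framework the abstract names, input-output mutual information bounds, is typified by the bound of Xu and Raginsky (2017); this is a standard statement, not quoted from the paper. For a learning algorithm with output W trained on an n-point sample S, and a loss that is sigma-sub-Gaussian,

    |E[gen(W, S)]| <= sqrt( 2 * sigma^2 * I(W; S) / n ).

The paper proves that no bound in this family, nor the conditional, PAC-Bayes, or surrogate-based variants it lists, can establish the minimax rates for gradient descent in stochastic convex optimization.

The noisy "surrogate" tactic the abstract refers to can be illustrated with a minimal sketch: run gradient descent, then perturb the final iterate with Gaussian noise so that quantities such as I(W; S) stay finite. The quadratic objective and all function names below are illustrative assumptions, not the paper's construction.

    # Minimal sketch of the noisy "surrogate" tactic described in the abstract:
    # run gradient descent (GD) on a convex objective, then corrupt the final
    # iterate with Gaussian noise. The quadratic objective and all names here
    # are illustrative assumptions, not the paper's construction.
    import numpy as np

    def gd_final_iterate(grad, w0, lr=0.1, steps=100):
        # Deterministic full-batch GD; returns only the final iterate w_T.
        w = w0.copy()
        for _ in range(steps):
            w = w - lr * grad(w)
        return w

    def noisy_surrogate(w_final, noise_std, rng):
        # Gaussian perturbation of w_T; adding noise keeps mutual-information
        # quantities finite, the usual motivation for analyzing this surrogate.
        return w_final + noise_std * rng.standard_normal(w_final.shape)

    # Toy convex objective f(w) = 0.5 * ||w - w_star||^2, so grad(w) = w - w_star.
    rng = np.random.default_rng(0)
    w_star = np.array([1.0, -2.0])
    w_T = gd_final_iterate(lambda w: w - w_star, w0=np.zeros(2))
    w_noisy = noisy_surrogate(w_T, noise_std=0.05, rng=rng)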
Pages: 663-706
Page count: 44