Assessing Local Generalization Capability in Deep Models

Cited by: 0
Authors
Wang, Huan [1]
Keskar, Nitish Shirish [1]
Xiong, Caiming [1]
Socher, Richard [1]
Affiliations
[1] Salesforce Res, Palo Alto, CA 94301 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
While it has not yet been proven, empirical evidence suggests that model generalization is related to local properties of the optima, which can be described via the Hessian. We connect model generalization with the local property of a solution under the PAC-Bayes paradigm. In particular, we prove that model generalization ability is related to the Hessian, the higher-order "smoothness" terms characterized by the Lipschitz constant of the Hessian, and the scales of the parameters. Guided by the proof, we propose a metric to score the generalization capability of a model, as well as an algorithm that optimizes the perturbed model accordingly.
Pages: 10
Related papers
50 in total
  • [1] Assessing the generalization capability of deep learning networks for aerial image classification using landscape metrics
    Gevaert, Caroline M.
    Belgiu, Mariana
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2022, 114
  • [2] Assessing the Cross-Market Generalization Capability of the CLAUDETTE System
    Jablonowska, Agnieszka
    Lagioia, Francesca
    Lippi, Marco
    Micklitz, Hans-Wolfgang
    Sartor, Giovanni
    Tagiuri, Giacomo
    LEGAL KNOWLEDGE AND INFORMATION SYSTEMS, 2021, 346 : 62 - 67
  • [3] GENERALIZATION OF LOCAL AND NON-LOCAL OPTICAL MODELS
    GREENLEE, GW
    TANG, YC
    PHYSICS LETTERS B, 1971, B 34 (05) : 359 - &
  • [4] A Study on the Generalization Capability of Acoustic Models for Robust Speech Recognition
    Xiao, Xiong
    Li, Jinyu
    Chng, Eng Siong
    Li, Haizhou
    Lee, Chin-Hui
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (06) : 1158 - 1169
  • [5] Explainable Deep Classification Models for Domain Generalization
    Zunino, Andrea
    Bargal, Sarah Adel
    Volpi, Riccardo
    Sameki, Mehrnoosh
    Zhang, Jianming
    Sclaroff, Stan
    Murino, Vittorio
    Saenko, Kate
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021 : 3227 - 3236
  • [6] Prediction of pulsating turbulent pipe flow by deep learning with generalization capability
    Matsubara, K.
    Mitsuishi, A.
    Iwamoto, K.
    Murata, A.
    INTERNATIONAL JOURNAL OF HEAT AND FLUID FLOW, 2023, 104
  • [7] A framework for assessing capability in organisations using enterprise models
    Romero, Marcelo
    Guedria, Wided
    Panetto, Herve
    Barafort, Beatrix
    JOURNAL OF INDUSTRIAL INFORMATION INTEGRATION, 2022, 27
  • [8] Assessing the Capability of Large Language Models in Naturopathy Consultation
    Mondal, Himel
    Komarraju, Satyalakshmi
    Sathyanath, D.
    Muralidharan, Shrikanth
    CUREUS JOURNAL OF MEDICAL SCIENCE, 2024, 16 (05)
  • [9] A framework for assessing capability in organisations using enterprise models
    Romero, Marcelo
    Guédria, Wided
    Panetto, Hervé
    Barafort, Béatrix
    Journal of Industrial Information Integration, 2022, 27
  • [10] Metrics for Assessing Generalization of Deep Reinforcement Learning in Parameterized Environments
    Aleksandrowicz, Maciej
    Jaworek-Korjakowska, Joanna
    JOURNAL OF ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING RESEARCH, 2023, 14 (01) : 45 - 61