Is My Neural Net Driven by the MDL Principle?

Cited by: 0
Authors
Brandao, Eduardo [1]
Duffner, Stefan [2]
Emonet, Remi [1]
Habrard, Amaury [1, 3]
Jacquenet, Francois [1]
Sebban, Marc [1]
Affiliations
[1] Univ Jean Monnet St Etienne, CNRS, Inst Opt Grad Sch, Lab Hubert Curien, UMR 5516, F-42023 St Etienne, France
[2] Univ Lyon, CNRS, INSA Lyon, LIRIS, UMR 5205, F-69621 Villeurbanne, France
[3] Inst Univ France IUF, Paris, France
Keywords
Neural Networks; MDL; Signal-Noise; Point Jacobians; MINIMUM DESCRIPTION LENGTH;
DOI
10.1007/978-3-031-43415-0_11
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The Minimum Description Length principle (MDL) is a formalization of Occam's razor for model selection, which states that a good model is one that can losslessly compress the data while including the cost of describing the model itself. While MDL can naturally express the behavior of certain models such as autoencoders (which inherently compress data), most representation learning techniques do not rely on such models. Instead, they learn representations by training on general or, for self-supervised learning, pretext tasks. In this paper, we propose a new formulation of the MDL principle that relies on the concept of signal and noise, which are implicitly defined by the learning task at hand. Additionally, we introduce ways to empirically measure the complexity of the learned representations by analyzing the spectra of the point Jacobians. Under certain assumptions, we show that the singular values of the point Jacobians of neural networks driven by the MDL principle should follow either a power law or a lognormal distribution. Finally, we conduct experiments to evaluate the behavior of the proposed measure applied to deep neural networks on different datasets, with respect to several types of noise. We observe that the experimental spectral distribution is in agreement with the spectral distribution predicted by our MDL principle, which suggests that neural networks trained with gradient descent on noisy data implicitly abide by the MDL principle.
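As a rough illustration of what "the spectrum of a point Jacobian" refers to, the sketch below computes the singular values of the input-output Jacobian of a small network at a single input point. Everything in it is an assumption made for illustration, not taken from the paper: the tanh MLP, the layer sizes, the random untrained weights, and the function names `point_jacobian` and `jacobian_spectrum` are all hypothetical. It does not fit a power law or lognormal distribution and does not reproduce the authors' measure.

```python
import numpy as np


def mlp_forward(x, weights, biases):
    """Forward pass of a small tanh MLP (hypothetical stand-in network)."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.tanh(W @ h + b)
    # Final layer is linear.
    return weights[-1] @ h + biases[-1]


def point_jacobian(x, weights, biases):
    """Jacobian of the network output w.r.t. its input, evaluated at x."""
    h = x
    J = np.eye(x.shape[0])
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.tanh(W @ h + b)
        # Chain rule: d tanh(z)/dz = 1 - tanh(z)^2, scaling each row of W @ J.
        J = (1.0 - h ** 2)[:, None] * (W @ J)
    return weights[-1] @ J


def jacobian_spectrum(x, weights, biases):
    """Singular values of the point Jacobian, in descending order."""
    return np.linalg.svd(point_jacobian(x, weights, biases), compute_uv=False)


# Assumed toy setup: random, untrained weights and one random input point.
rng = np.random.default_rng(0)
sizes = [8, 16, 16, 4]
weights = [rng.normal(scale=1.0 / np.sqrt(m), size=(n, m))
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]
x = rng.normal(size=sizes[0])

singular_values = jacobian_spectrum(x, weights, biases)
print(singular_values)
```

In an actual study one would presumably collect such spectra over many input points of a trained network and compare their empirical distribution with the candidate laws; that step is left out here.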
Pages: 173-189
Page count: 17