Jaynes machine: The universal microstructure of deep neural networks

Cited by: 1
Authors
Venkatasubramanian, Venkat [1 ]
Sanjeevrajan, N. [1 ]
Khandekar, Manasi [2 ]
Sivaram, Abhishek [3 ]
Szczepanski, Collin [1 ]
Affiliations
[1] Columbia Univ, Dept Chem Engn, Complex Resilient Intelligent Syst Lab, New York, NY 10027 USA
[2] Columbia Univ, Dept Comp Sci & Engn, New York, NY 10027 USA
[3] Tech Univ Denmark, Dept Chem & Biochem Engn, DK-2800 Lyngby, Denmark
Keywords
LLMs; Boltzmann machine; Hopfield networks; Game theory; Arbitrage equilibrium; Deep learning; Design; Systems
DOI
10.1016/j.compchemeng.2024.108908
Chinese Library Classification (CLC)
TP39 [Computer applications]
Discipline codes
081203; 0835
Abstract
Despite the recent stunning progress in large-scale deep neural network applications, our understanding of their microstructure, 'energy' functions, and optimal design remains incomplete. Here, we present a new game-theoretic framework, called statistical teleodynamics, that reveals important insights into these key properties. The optimally robust design of such networks inherently involves computational benefit-cost trade-offs that physics-inspired models do not adequately capture. These trade-offs occur as neurons and connections compete to increase their effective utilities under resource constraints during training. In a fully trained network, this results in a state of arbitrage equilibrium, where all neurons in a given layer have the same effective utility, and all connections to a given layer have the same effective utility. The equilibrium is characterized by the emergence of two lognormal distributions of connection weights and neuronal outputs as the universal microstructure of large deep neural networks. We call such a network the Jaynes Machine. Our theoretical predictions are shown to be supported by empirical data from seven large-scale deep neural networks. We also show that the Hopfield network and the Boltzmann Machine are the same special case of the Jaynes Machine.
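
The abstract's central empirical claim, that connection-weight magnitudes in a fully trained network follow a lognormal distribution, is straightforward to probe on any pretrained model. The sketch below is illustrative only and is not the authors' code: it assumes PyTorch, torchvision, NumPy, and SciPy are installed, and uses a pretrained ResNet-50 as an arbitrary stand-in for the seven large networks studied in the paper (which are not named in this record). The check rests on a standard fact: if |w| is lognormal, then log|w| is normal, so its skewness and excess kurtosis should both be near zero.

    # Illustrative probe of the lognormal-weights claim; not the authors' code.
    # Assumes PyTorch, torchvision, NumPy, and SciPy; ResNet-50 is a stand-in
    # for a "large deep neural network".
    import numpy as np
    import torch
    import torchvision.models as models
    from scipy import stats

    model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)

    for name, param in model.named_parameters():
        if param.ndim < 2:  # skip biases and normalization parameters
            continue
        w = param.detach().abs().flatten().numpy()
        log_w = np.log(w[w > 0])  # log of exact zeros is undefined; drop them
        # Under the lognormal hypothesis, log|w| is normal, so its skewness
        # and excess kurtosis should both be close to zero.
        print(f"{name}: n={log_w.size}, "
              f"skew={stats.skew(log_w):.3f}, "
              f"ex. kurtosis={stats.kurtosis(log_w):.3f}")

Per-layer statistics are reported rather than a single pooled fit because the paper's equilibrium argument is layer-wise: neurons and connections equalize effective utilities within a layer, not across the whole network.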
Pages: 10