Beyond the Storage Capacity: Data-Driven Satisfiability Transition

Cited by: 13
Authors
Rotondo, Pietro [1, 2]
Pastore, Mauro [1, 2]
Gherardi, Marco [1, 2]
Affiliations
[1] Ist Nazl Fis Nucl, Sez Milano, Via Celoria 16, I-20133 Milan, Italy
[2] Univ Milan, Via Celoria 16, I-20133 Milan, Italy
Funding
EU Horizon 2020
© 2020 American Physical Society
DOI
10.1103/PhysRevLett.125.120601
Chinese Library Classification
O4 [Physics]
Discipline classification code
0702
Abstract
Data structure has a dramatic impact on the properties of neural networks, yet its significance in the established theoretical frameworks is poorly understood. Here we compute the Vapnik-Chervonenkis entropy of a kernel machine operating on data grouped into equally labeled subsets. At variance with the unstructured scenario, entropy is nonmonotonic in the size of the training set, and displays an additional critical point besides the storage capacity. Remarkably, the same behavior occurs in margin classifiers even with randomly labeled data, as is elucidated by identifying the synaptic volume encoding the transition. These findings reveal aspects of expressivity lying beyond the condensed description provided by the storage capacity, and they indicate the path towards more realistic bounds for the generalization error of neural networks.
Pages: 6