Do Kernel and Neural Embeddings Help in Training and Generalization?

Cited: 0
Authors
Rahbar, Arman [1 ]
Jorge, Emilio [1 ]
Dubhashi, Devdatt [1 ]
Chehreghani, Morteza Haghir [1 ]
Affiliations
[1] Chalmers Univ Technol, Dept Comp Sci & Engn, SE-41296 Gothenburg, Sweden
Keywords
Kernel embedding; Gram matrix; Neural network; Convergence
DOI
10.1007/s11063-022-10958-8
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Recent results on the optimization and generalization properties of neural networks showed that, in a simple two-layer network, the alignment of the labels with the eigenvectors of the corresponding Gram matrix determines the convergence of the optimization during training. Such analyses also provide upper bounds on the generalization error. We experimentally investigate the implications of these results for deeper networks via embeddings. We regard the layers preceding the final hidden layer as producing different representations of the input data, which are then fed to the two-layer model. We show that these representations improve both optimization and generalization. In particular, we investigate three kernel representations fed to the final hidden layer: the Gaussian kernel and its approximation by random Fourier features, kernels designed to imitate representations produced by neural networks, and finally an optimal kernel designed to align the data with the target labels. The approximated representations induced by these kernels are fed to the neural network, and the optimization and generalization properties of the final model are evaluated and compared.
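The abstract's central objects, a kernel-induced Gram matrix, its eigenvectors, and the alignment of the labels with those eigenvectors, can be illustrated numerically. The sketch below is not the authors' code: the random Fourier feature construction, the toy two-blob data, and the label_eigen_alignment helper are assumptions made purely for illustration. It approximates a Gaussian kernel with random Fourier features (one of the three representations named above) and reports how much of the label vector lies in the span of the top eigenvectors of the induced Gram matrix, the kind of quantity the cited two-layer analyses relate to convergence speed and generalization bounds.

    # Illustrative sketch, not the paper's implementation.
    # Approximates a Gaussian kernel with random Fourier features and measures
    # how much of the label vector lies in the top eigenspace of the Gram matrix.
    import numpy as np

    rng = np.random.default_rng(0)

    def random_fourier_features(X, n_features=512, gamma=0.1):
        # Map x to z(x) so that z(x) . z(y) approximates exp(-gamma * ||x - y||^2).
        d = X.shape[1]
        W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
        b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
        return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

    def label_eigen_alignment(K, y, top_k=10):
        # Fraction of the labels' energy captured by the top_k eigenvectors of K.
        eigvals, eigvecs = np.linalg.eigh(K)   # eigenvalues in ascending order
        top = eigvecs[:, -top_k:]              # eigenvectors with the largest eigenvalues
        projected = top.T @ y
        return float(projected @ projected) / float(y @ y)

    # Toy data (hypothetical): two Gaussian blobs with labels -1 and +1.
    n, d = 200, 20
    X = np.concatenate([rng.normal(-1.0, 1.0, (n // 2, d)),
                        rng.normal(+1.0, 1.0, (n // 2, d))])
    y = np.concatenate([-np.ones(n // 2), np.ones(n // 2)])

    Z = random_fourier_features(X)   # the embedding that would feed the two-layer model
    K_embedded = Z @ Z.T             # approximate Gaussian Gram matrix
    K_raw = X @ X.T                  # Gram matrix of the raw inputs, for comparison

    print("alignment, RFF embedding:", label_eigen_alignment(K_embedded, y))
    print("alignment, raw inputs   :", label_eigen_alignment(K_raw, y))

In the paper's setting, an embedding like Z would replace the raw inputs fed to the final two-layer model; under the cited analyses, a higher alignment of the labels with the top eigenvectors of the resulting Gram matrix loosely corresponds to faster optimization and a smaller generalization bound.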
Pages: 1681-1695
Page count: 15