Learning in the Presence of Low-dimensional Structure: A Spiked Random Matrix Perspective

Cited: 0
Authors
Ba, Jimmy [1 ,2 ,3 ]
Erdogdu, Murat A. [1 ,2 ]
Suzuki, Taiji [4 ,5 ]
Wang, Zhichao [6 ]
Wu, Denny [7 ,8 ]
Affiliations
[1] Univ Toronto, Toronto, ON, Canada
[2] Vector Inst, Toronto, ON, Canada
[3] xAI, Burlingame, CA USA
[4] Univ Tokyo, Tokyo, Japan
[5] RIKEN AIP, Tokyo, Japan
[6] Univ Calif San Diego, San Diego, CA USA
[7] New York Univ, New York, NY USA
[8] Flatiron Inst, New York, NY USA
Funding
Natural Sciences and Engineering Research Council of Canada (NSERC);
Keywords
LARGEST EIGENVALUE;
DOI
None
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
We consider the problem of learning a single-index target function $f_* : \mathbb{R}^d \to \mathbb{R}$ under spiked covariance data: $f_*(x) = \sigma_*\!\big(\langle x, \mu\rangle / \sqrt{1+\theta}\big)$, $x \sim \mathcal{N}(0, I_d + \theta \mu\mu^\top)$, $\theta \asymp d^\beta$ for $\beta \in [0, 1)$, where the link function $\sigma_* : \mathbb{R} \to \mathbb{R}$ is a degree-$p$ polynomial with information exponent $k$ (defined as the lowest degree in the Hermite expansion of $\sigma_*$), and the target depends on the projection of the input $x$ onto the spike (signal) direction $\mu \in \mathbb{R}^d$. In the proportional asymptotic limit where the number of training examples $n$ and the dimensionality $d$ jointly diverge, $n, d \to \infty$, $n/d \to \psi \in (0, \infty)$, we ask the following question: how large should the spike magnitude $\theta$ be in order for (i) kernel methods and (ii) neural networks optimized by gradient descent to learn $f_*$? We show that for kernel ridge regression, $\beta \ge 1 - 1/p$ is both sufficient and necessary, whereas for two-layer neural networks trained with gradient descent, $\beta > 1 - 1/k$ suffices. Our results demonstrate that both kernel methods and neural networks benefit from low-dimensional structure in the data. Further, since $k \le p$ by definition, neural networks can adapt to such structure more effectively.
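The data model in the abstract can be sketched numerically as follows. This is a minimal illustration, not the paper's experimental setup: all parameter values ($d$, $n$, $\beta$, the $\mathrm{He}_2$ link, the RBF kernel width, the ridge penalty) are assumptions chosen for demonstration.

```python
import numpy as np

# Sample spiked-covariance inputs x ~ N(0, I_d + theta * mu mu^T) and labels
# y = sigma_*(<x, mu> / sqrt(1 + theta)) for a polynomial link. Here we take
# sigma_* = He_2 (the second Hermite polynomial), so degree p = 2 and
# information exponent k = 2 (illustrative choice, not from the paper).
rng = np.random.default_rng(0)
d, n = 200, 400                    # dimension and sample size (n/d = psi = 2)
beta = 0.5
theta = d ** beta                  # spike magnitude theta ~ d^beta
mu = rng.standard_normal(d)
mu /= np.linalg.norm(mu)           # unit spike (signal) direction

# x = z + sqrt(theta) * g * mu with z ~ N(0, I_d), g ~ N(0, 1) independent,
# which gives Cov(x) = I_d + theta * mu mu^T.
Z = rng.standard_normal((n, d))
g = rng.standard_normal(n)
X = Z + np.sqrt(theta) * np.outer(g, mu)

proj = X @ mu / np.sqrt(1 + theta)  # standardized projection onto the spike
y = proj ** 2 - 1                   # He_2(t) = t^2 - 1: lowest Hermite degree is 2

# Kernel ridge regression with an RBF kernel, one of the methods the paper
# analyzes (kernel choice and hyperparameters here are assumptions).
def rbf(A, B, gamma):
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

gamma, lam = 1.0 / d, 1e-3
K = rbf(X, X, gamma)
alpha = np.linalg.solve(K + lam * np.eye(n), y)  # ridge coefficients
train_pred = K @ alpha
print("train MSE:", np.mean((train_pred - y) ** 2))
```

Varying `beta` (hence `theta`) in this sketch is one way to probe the thresholds the paper establishes, e.g. whether kernel ridge regression succeeds only when $\beta \ge 1 - 1/p$.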
Pages: 30