MetaGrad: Adaptation using Multiple Learning Rates in Online Learning

Cited by: 0
Authors
van Erven, Tim [1]
Koolen, Wouter M. [2]
van der Hoeven, Dirk [3]
Affiliations
[1] Univ Amsterdam, Korteweg Vries Inst Math, Sci Pk 107, NL-1098 XG Amsterdam, Netherlands
[2] Ctr Wiskunde & Informat, Sci Pk 123, NL-1098 XG Amsterdam, Netherlands
[3] Leiden Univ, Math Inst, Niels Bohrweg 1, NL-2300 RA Leiden, Netherlands
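Keywords
online convex optimization; adaptivity; frequent directions
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract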
We provide a new adaptive method for online convex optimization, MetaGrad, that is robust to general convex losses but achieves faster rates for a broad class of special functions, including exp-concave and strongly convex functions as well as various types of stochastic and non-stochastic functions without any curvature. We prove this by drawing a connection to the Bernstein condition, which is known to imply fast rates in offline statistical learning. MetaGrad further adapts automatically to the size of the gradients. Its main feature is that it simultaneously considers multiple learning rates, which are weighted in direct proportion to their empirical performance on the data using a new meta-algorithm. We provide three versions of MetaGrad. The full matrix version maintains a full covariance matrix and is applicable to learning tasks for which we can afford an update time that is quadratic in the dimension. The other two versions provide speed-ups for high-dimensional learning tasks, with an update time that is linear in the dimension: one is based on sketching, the other on running a separate copy of the basic algorithm per coordinate. We evaluate all versions of MetaGrad on benchmark online classification and regression tasks, on which they consistently outperform both online gradient descent and AdaGrad.
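The idea of running several learning rates at once and weighting them by their observed performance can be illustrated with a small sketch. The code below is not the MetaGrad meta-algorithm, whose surrogate losses and update rules are not given in this record. It is a generic stand-in that assumes one projected online gradient descent learner per learning rate, combined by plain exponential weights on cumulative loss, in one dimension. The function name `combine_learning_rates` and all of its parameters are hypothetical.

```python
import math


def clip(x, lo, hi):
    """Project a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def combine_learning_rates(loss, grad, etas, steps, radius=1.0):
    """Run one gradient-descent learner per learning rate and average them.

    Assumption (not from the paper): learners are weighted by
    exp(-cumulative loss), i.e. standard exponential weights.
    Returns the final combined prediction and the normalized weights.
    """
    points = [0.0] * len(etas)      # current iterate of each learner
    cum_loss = [0.0] * len(etas)    # loss each learner has suffered so far
    prediction = 0.0
    weights = [1.0 / len(etas)] * len(etas)

    for _ in range(steps):
        # Exponential weights, shifted by the best loss for numerical stability.
        best = min(cum_loss)
        raw = [math.exp(-(c - best)) for c in cum_loss]
        total = sum(raw)
        weights = [r / total for r in raw]
        prediction = sum(w * p for w, p in zip(weights, points))

        # Each learner suffers its own loss, then takes a projected gradient step.
        for i, eta in enumerate(etas):
            cum_loss[i] += loss(points[i])
            points[i] = clip(points[i] - eta * grad(points[i]), -radius, radius)

    return prediction, weights


if __name__ == "__main__":
    # Hypothetical toy problem: a fixed quadratic loss with minimizer 0.5.
    etas = [2.0 ** -i for i in range(6)]
    pred, w = combine_learning_rates(
        loss=lambda x: (x - 0.5) ** 2,
        grad=lambda x: 2.0 * (x - 0.5),
        etas=etas,
        steps=200,
    )
    print(pred, w)
```

On this toy problem the learning rate that is too large oscillates and quickly loses its weight, so the combined prediction follows the better-tuned learners without the learning rate having to be chosen in advance.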
Pages: 61
Related papers
50 entries in total
  • [1] Metagrad: Adaptation using multiple learning rates in online learning
    Van Erven, Tim
    Koolen, Wouter M.
    Van Der Hoeven, Dirk
    1600, Microtome Publishing (22):
  • [2] MetaGrad: Multiple Learning Rates in Online Learning
    van Erven, Tim
    Koolen, Wouter M.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [3] Lipschitz Adaptivity with Multiple Learning Rates in Online Learning
    Mhammedi, Zakaria
    Koolen, Wouter M.
    van Erven, Tim
    CONFERENCE ON LEARNING THEORY, VOL 99, 2019, 99
  • [4] Tracking of Multiple Targets Using Online Learning for Reference Model Adaptation
    Pernkopf, Franz
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS, 2008, 38 (06): 1465 - 1475
  • [5] Online Learning of a Memory for Learning Rates
    Meier, Franziska
    Kappler, Daniel
    Schaal, Stefan
    2018 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2018: 2425 - 2432
  • [6] Learning Path Adaptation in Online Learning Systems
    Muhammad, Alva
    Zhou, Qingguo
    Beydoun, Ghassan
    Xu, Dongming
    Shen, Jun
    2016 IEEE 20TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN (CSCWD), 2016: 421 - 426
  • [7] Adaptation in Online Social Learning
    Bordignon, Virginia
    Matta, Vincenzo
    Sayed, Ali H.
    28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021: 2170 - 2174
  • [8] Online learning and recidivism rates
    Sellers, Martin P.
    INTERNATIONAL JOURNAL OF LEADERSHIP IN EDUCATION, 2016, 19 (05): 632 - 636
  • [9] ADAPTATION AND LEARNING USING MULTIPLE MODELS, SWITCHING, AND TUNING
    NARENDRA, KS
    BALAKRISHNAN, J
    CILIZ, MK
    IEEE CONTROL SYSTEMS MAGAZINE, 1995, 15 (03): 37 - 51
  • [10] Online Learning Using Multiple Times Weight Updating
    Singh, Charanjeet
    Sharma, Anuj
    APPLIED ARTIFICIAL INTELLIGENCE, 2020, 34 (06): 515 - 536