MetaGrad: Adaptation using Multiple Learning Rates in Online Learning

Cited by: 0
Authors
van Erven, Tim [1 ]
Koolen, Wouter M. [2 ]
van der Hoeven, Dirk [3 ]
Affiliations
[1] Univ Amsterdam, Korteweg Vries Inst Math, Sci Pk 107, NL-1098 XG Amsterdam, Netherlands
[2] Ctr Wiskunde & Informat, Sci Pk 123, NL-1098 XG Amsterdam, Netherlands
[3] Leiden Univ, Math Inst, Niels Bohrweg 1, NL-2300 RA Leiden, Netherlands
Keywords
online convex optimization; adaptivity; frequent directions
DOI
Not available
Chinese Library Classification
TP [automation technology; computer technology]
Discipline code
0812
Abstract
We provide a new adaptive method for online convex optimization, MetaGrad, that is robust to general convex losses but achieves faster rates for a broad class of special functions, including not only exp-concave and strongly convex functions but also various types of stochastic and non-stochastic functions without any curvature. We prove this by drawing a connection to the Bernstein condition, which is known to imply fast rates in offline statistical learning. MetaGrad further adapts automatically to the size of the gradients. Its main feature is that it simultaneously considers multiple learning rates, which are weighted in direct proportion to their empirical performance on the data by a new meta-algorithm. We provide three versions of MetaGrad. The full-matrix version maintains a full covariance matrix and is applicable to learning tasks for which we can afford update time quadratic in the dimension. The other two versions provide speed-ups for high-dimensional learning tasks with an update time that is linear in the dimension: one is based on sketching, the other on running a separate copy of the basic algorithm per coordinate. We evaluate all versions of MetaGrad on benchmark online classification and regression tasks, on which they consistently outperform both online gradient descent and AdaGrad.
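The core mechanism described in the abstract — several learning rates run in parallel, combined by a meta-algorithm that reweights them according to their empirical performance — can be illustrated with a minimal one-dimensional sketch. This is an illustrative simplification, not the paper's algorithm: the grid size, the per-expert gradient step, and the function names here are assumptions, whereas MetaGrad's experts use a second-order (full-matrix or sketched) update. The surrogate loss below mirrors the quadratic surrogate that the paper's meta-algorithm feeds into exponential weights.

```python
import math

def metagrad_1d(grad, T, G=1.0, D=1.0):
    """Hypothetical 1-D sketch of the MetaGrad idea.

    Each "expert" owns one learning rate eta_i from an exponential grid;
    a meta-algorithm reweights experts by exponential weights on a
    quadratic surrogate loss and plays their tilted average.
    """
    n = max(1, math.ceil(math.log2(T)))          # grid of ~log(T) learning rates
    etas = [1.0 / (5 * G * D * 2 ** i) for i in range(n)]
    ws = [0.0] * n                               # each expert's iterate
    pis = [1.0 / n] * n                          # meta-weights over experts
    w = 0.0
    for t in range(T):
        # tilted aggregation: each expert weighted by pi_i * eta_i
        z = sum(p * e for p, e in zip(pis, etas))
        w = sum(p * e * wi for p, e, wi in zip(pis, etas, ws)) / z
        g = grad(w, t)                           # gradient at the master point
        # quadratic surrogate loss per expert, then exponential-weights update
        new = []
        for p, e, wi in zip(pis, etas, ws):
            ell = -e * g * (w - wi) + (e * g * (w - wi)) ** 2
            new.append(p * math.exp(-ell))
        s = sum(new)
        pis = [p / s for p in new]
        # simplified per-expert step, clipped to the domain [-D, D]
        # (the real algorithm uses a second-order update here)
        ws = [min(D, max(-D, wi - e * g)) for wi, e in zip(ws, etas)]
    return w
```

On a toy curved loss such as f(w) = (w - 0.5)^2, the experts with well-tuned learning rates quickly accumulate meta-weight and the combined iterate settles near the minimizer, which is the behavior the abstract's fast-rate guarantees formalize for exp-concave and strongly convex losses.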
Pages: 61