MetaGrad: Adaptation using Multiple Learning Rates in Online Learning

Cited by: 0
Authors
van Erven, Tim [1 ]
Koolen, Wouter M. [2 ]
van der Hoeven, Dirk [3 ]
Affiliations
[1] Univ Amsterdam, Korteweg Vries Inst Math, Sci Pk 107, NL-1098 XG Amsterdam, Netherlands
[2] Ctr Wiskunde & Informat, Sci Pk 123, NL-1098 XG Amsterdam, Netherlands
[3] Leiden Univ, Math Inst, Niels Bohrweg 1, NL-2300 RA Leiden, Netherlands
Keywords
online convex optimization; adaptivity; frequent directions
DOI
Not available
Chinese Library Classification
TP [automation technology; computer technology]
Discipline code
0812
Abstract
We provide a new adaptive method for online convex optimization, MetaGrad, that is robust to general convex losses but achieves faster rates for a broad class of special functions, including not only exp-concave and strongly convex functions but also various types of stochastic and non-stochastic functions without any curvature. We prove this by drawing a connection to the Bernstein condition, which is known to imply fast rates in offline statistical learning. MetaGrad further adapts automatically to the size of the gradients. Its main feature is that it simultaneously considers multiple learning rates, which are weighted in direct proportion to their empirical performance on the data by a new meta-algorithm. We provide three versions of MetaGrad. The full-matrix version maintains a full covariance matrix and is applicable to learning tasks for which we can afford update time quadratic in the dimension. The other two versions provide speed-ups for high-dimensional learning tasks with an update time that is linear in the dimension: one is based on sketching, the other on running a separate copy of the basic algorithm per coordinate. We evaluate all versions of MetaGrad on benchmark online classification and regression tasks, on which they consistently outperform both online gradient descent and AdaGrad.
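The core mechanism described in the abstract — several learning rates run in parallel, combined by a meta-algorithm that reweights them according to their empirical performance — can be illustrated with a minimal one-dimensional sketch. This is an illustrative simplification, not the paper's algorithm: the grid size, the per-expert gradient step, and the function names here are assumptions, whereas MetaGrad's experts use a second-order (full-matrix or sketched) update. The surrogate loss below mirrors the quadratic surrogate that the paper's meta-algorithm feeds into exponential weights.

```python
import math

def metagrad_1d(grad, T, G=1.0, D=1.0):
    """Hypothetical 1-D sketch of the MetaGrad idea.

    Each "expert" owns one learning rate eta_i from an exponential grid;
    a meta-algorithm reweights experts by exponential weights on a
    quadratic surrogate loss and plays their tilted average.
    """
    n = max(1, math.ceil(math.log2(T)))          # grid of ~log(T) learning rates
    etas = [1.0 / (5 * G * D * 2 ** i) for i in range(n)]
    ws = [0.0] * n                               # each expert's iterate
    pis = [1.0 / n] * n                          # meta-weights over experts
    w = 0.0
    for t in range(T):
        # tilted aggregation: each expert weighted by pi_i * eta_i
        z = sum(p * e for p, e in zip(pis, etas))
        w = sum(p * e * wi for p, e, wi in zip(pis, etas, ws)) / z
        g = grad(w, t)                           # gradient at the master point
        # quadratic surrogate loss per expert, then exponential-weights update
        new = []
        for p, e, wi in zip(pis, etas, ws):
            ell = -e * g * (w - wi) + (e * g * (w - wi)) ** 2
            new.append(p * math.exp(-ell))
        s = sum(new)
        pis = [p / s for p in new]
        # simplified per-expert step, clipped to the domain [-D, D]
        # (the real algorithm uses a second-order update here)
        ws = [min(D, max(-D, wi - e * g)) for wi, e in zip(ws, etas)]
    return w
```

On a toy curved loss such as f(w) = (w - 0.5)^2, the experts with well-tuned learning rates quickly accumulate meta-weight and the combined iterate settles near the minimizer, which is the behavior the abstract's fast-rate guarantees formalize for exp-concave and strongly convex losses.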
Pages: 61