Online Learning with Local Permutations and Delayed Feedback

Cited by: 0

Authors:

Shamir, Ohad [1]

Szlak, Liran [1]

Affiliations:

[1] Weizmann Inst Sci, Rehovot, Israel

Source:

INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70 | 2017 / Vol. 70

Funding:

Israel Science Foundation;

Keywords:

ALGORITHMS;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose an Online Learning with Local Permutations (OLLP) setting, in which the learner is allowed to slightly permute the order of the loss functions generated by an adversary. On the one hand, this models natural situations where the exact order of the learner's responses is not crucial; on the other hand, it might allow better learning and regret performance by mitigating highly adversarial loss sequences. With random permutations, this can also be seen as a setting interpolating between adversarial and stochastic losses. In this paper, we consider the applicability of this setting to convex online learning with delayed feedback, in which the feedback on the prediction made in round $t$ arrives with some delay $\tau$. With such delayed feedback, the best possible regret bound is well known to be $O(\sqrt{\tau T})$. We prove that by being able to permute losses by a distance of at most $M$ (for $M \geq \tau$), the regret can be improved to $O(\sqrt{T}(1 + \sqrt{\tau^2/M}))$, using a Mirror-Descent-based algorithm which can be applied to both Euclidean and non-Euclidean geometries. We also prove a lower bound, showing that for $M < \tau/3$, it is impossible to improve the standard $O(\sqrt{\tau T})$ regret bound by more than constant factors. Finally, we provide some experiments validating the performance of our algorithm.
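
As a rough illustration of how the two upper bounds quoted in the abstract relate, the following worked comparison plugs two values of $M$ into the permuted-feedback bound. Constants hidden by the $O(\cdot)$ notation are ignored, $\tau \geq 1$ is assumed, and the two choices of $M$ are illustrative assumptions rather than cases singled out by the paper.

```latex
% Bound with permutation distance M (for M >= tau), constants dropped:
%   B(M) = \sqrt{T}\,\bigl(1 + \sqrt{\tau^{2}/M}\bigr)
\begin{align*}
M = \tau:\quad
  B(\tau) &= \sqrt{T}\,\bigl(1 + \sqrt{\tau}\bigr)
           \le 2\sqrt{\tau T}
  && \text{(same order as the standard } O(\sqrt{\tau T}) \text{ bound)} \\
M = \tau^{2}:\quad
  B(\tau^{2}) &= \sqrt{T}\,\bigl(1 + 1\bigr) = 2\sqrt{T}
  && \text{(no dependence on } \tau \text{ in this expression)}
\end{align*}
```

Under these assumptions, the bound matches the standard delayed-feedback rate at $M = \tau$ and loses its dependence on the delay once $M$ is on the order of $\tau^2$.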


Pages: 9
