Online Learning with Local Permutations and Delayed Feedback

Cited by: 0
Authors
Shamir, Ohad [1]
Szlak, Liran [1]
Affiliations
[1] Weizmann Inst Sci, Rehovot, Israel
Funding
Israel Science Foundation;
Keywords
ALGORITHMS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose an Online Learning with Local Permutations (OLLP) setting, in which the learner is allowed to slightly permute the order of the loss functions generated by an adversary. On one hand, this models natural situations where the exact order of the learner's responses is not crucial, and on the other hand, might allow better learning and regret performance, by mitigating highly adversarial loss sequences. Also, with random permutations, this can be seen as a setting interpolating between adversarial and stochastic losses. In this paper, we consider the applicability of this setting to convex online learning with delayed feedback, in which the feedback on the prediction made in round $t$ arrives with some delay $\tau$. With such delayed feedback, the best possible regret bound is well-known to be $O(\sqrt{\tau T})$. We prove that by being able to permute losses by a distance of at most $M$ (for $M \geq \tau$), the regret can be improved to $O(\sqrt{T}(1 + \sqrt{\tau^2/M}))$, using a Mirror-Descent based algorithm which can be applied for both Euclidean and non-Euclidean geometries. We also prove a lower bound, showing that for $M < \tau/3$, it is impossible to improve the standard $O(\sqrt{\tau T})$ regret bound by more than constant factors. Finally, we provide some experiments validating the performance of our algorithm.
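To get a feel for how the two regret rates quoted in the abstract compare, the following is a minimal numeric sketch. It evaluates only the expressions inside the O(.) notation, with all constants and lower-order terms dropped, so the numbers indicate scaling rather than actual regret. The function names `delayed_bound` and `permuted_bound` are hypothetical and chosen for illustration; they do not come from the paper.

```python
import math


def delayed_bound(tau: int, T: int) -> float:
    """Rate sqrt(tau * T) for delayed feedback (constants ignored)."""
    return math.sqrt(tau * T)


def permuted_bound(tau: int, M: int, T: int) -> float:
    """Rate sqrt(T) * (1 + sqrt(tau^2 / M)) with local permutations.

    The abstract states this rate for M >= tau, so smaller M is rejected
    here (an illustrative choice, not part of the paper).
    """
    if M < tau:
        raise ValueError("the stated rate assumes M >= tau")
    return math.sqrt(T) * (1.0 + math.sqrt(tau ** 2 / M))


if __name__ == "__main__":
    T, tau = 10_000, 16
    print(f"delayed feedback only: {delayed_bound(tau, T):.1f}")
    # Sweep the permutation window from M = tau up to M = tau^2.
    for M in (tau, 4 * tau, tau ** 2):
        print(f"M = {M:4d}: {permuted_bound(tau, M, T):.1f}")
```

Under these simplifications, the permuted rate at M = tau is of the same order as sqrt(tau T), and it shrinks toward a constant multiple of sqrt(T) as M approaches tau^2, which is the regime where the delay no longer affects the rate.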
Pages: 9