Time-Decaying Bandits for Non-stationary Systems

Cited by: 5
Authors
[1] Komiyama, Junpei
[2] Qin, Tao
Source
Komiyama, Junpei | Springer Verlag, vol. 8877, 2014
Keywords
Stochastic systems
DOI
10.1007/978-3-319-13129-0_40
Abstract
Content displayed on web portals (e.g., news articles at Yahoo.com) is usually adaptively selected from a dynamic set of candidate items, and the attractiveness of each item decays over time. The goal of these websites is to maximize user engagement (usually measured by clicks) on the selected items. We formulate this kind of application as a new variant of the bandit problem in which new arms are dynamically added to the candidate set and the expected reward of each arm decays as the rounds proceed. For this new problem, a direct application of algorithms designed for the stochastic MAB (e.g., UCB) leads to over-estimation of the rewards of old arms and thus to misidentification of the optimal arm. To tackle this challenge, we propose a new algorithm that adaptively estimates the temporal dynamics in the rewards of the arms and, on this basis, effectively identifies the best arm at a given time point. When the temporal dynamics are represented by a set of features, the proposed algorithm achieves sub-linear regret. Our experiments verify the effectiveness of the proposed algorithm. © Springer International Publishing Switzerland 2014.
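The over-estimation effect described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the exponential decay model, the function names, and the numeric values below are all assumptions chosen for illustration. It compares what a stationary estimator (the running mean that UCB builds on) would report for an old arm against that arm's current expected reward.

```python
def expected_reward(initial, decay, age):
    """Expected reward of an arm `age` rounds after it was added.

    Assumption for illustration: exponential decay, initial * decay**age.
    """
    return initial * decay ** age


def historical_mean(initial, decay, age):
    """Mean of the expected rewards over ages 0..age-1.

    This is the value a stationary empirical-mean estimator tends toward
    if the arm had been pulled once in each of those rounds.
    """
    if age <= 0:
        raise ValueError("age must be positive")
    return sum(expected_reward(initial, decay, a) for a in range(age)) / age


# Hypothetical arms: an old, initially attractive item and a newly added one.
old_arm = {"initial": 0.9, "decay": 0.95, "age": 40}
new_arm = {"initial": 0.3, "decay": 0.95, "age": 1}

old_estimate = historical_mean(**old_arm)   # what a stationary estimator sees
new_estimate = historical_mean(**new_arm)
old_current = expected_reward(**old_arm)    # what the arm is worth right now
new_current = expected_reward(**new_arm)

# The stationary estimate ranks the old arm first even though the new arm
# currently has the higher expected reward.
stationary_choice = "old" if old_estimate > new_estimate else "new"
true_best = "old" if old_current > new_current else "new"
```

Under these assumed values, the stationary estimate and the true current ranking disagree, which is the misidentification the abstract attributes to applying stationary MAB algorithms directly.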