Time-Decaying Bandits for Non-stationary Systems

Cited by: 5
Authors
[1] Komiyama, Junpei
[2] Qin, Tao
Source
Komiyama, Junpei | Springer Verlag, vol. 8877
Keywords
Stochastic systems
DOI
10.1007/978-3-319-13129-0_40
Abstract
Contents displayed on web portals (e.g., news articles at Yahoo.com) are usually adaptively selected from a dynamic set of candidate items, and the attractiveness of each item decays over time. The goal of those websites is to maximize the engagement of users (usually measured by their clicks) on the selected items. We formulate this kind of application as a new variant of the bandit problem in which new arms are dynamically added to the candidate set and the expected reward of each arm decays as the rounds proceed. For this new problem, a direct application of the algorithms designed for stochastic MAB (e.g., UCB) will lead to over-estimation of the rewards of old arms, and thus cause a misidentification of the optimal arm. To tackle this challenge, we propose a new algorithm that can adaptively estimate the temporal dynamics in the rewards of the arms, and effectively identify the best arm at a given time point on this basis. When the temporal dynamics are represented by a set of features, the proposed algorithm achieves sub-linear regret. Our experiments verify the effectiveness of the proposed algorithm. © Springer International Publishing Switzerland 2014.
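The over-estimation effect mentioned in the abstract can be illustrated with a small simulation. This is only a sketch under assumed conditions, not the algorithm proposed in the paper: the exponential decay form, the parameter values, and the helper names `decayed_mean` and `sample_average_after` are all hypothetical choices made for illustration.

```python
import random


def decayed_mean(base: float, decay: float, age: int) -> float:
    """Expected reward of an arm `age` rounds after it was added.

    Assumption: exponential decay, base * decay**age. The abstract does not
    specify the decay form; this is chosen only for illustration.
    """
    return base * decay ** age


def sample_average_after(base: float, decay: float, pulls: int,
                         rng: random.Random) -> float:
    """Pull one arm `pulls` times in a row and return the plain sample
    average of its Bernoulli clicks, i.e. the kind of stationary estimate
    that a standard UCB index is built around."""
    clicks = 0
    for age in range(pulls):
        clicks += int(rng.random() < decayed_mean(base, decay, age))
    return clicks / pulls


rng = random.Random(0)                # fixed seed for repeatability
base, decay, pulls = 0.8, 0.99, 300   # hypothetical parameters

estimate = sample_average_after(base, decay, pulls, rng)
current = decayed_mean(base, decay, pulls)

# Noise-free version of the same estimate: the average of the expected
# rewards seen so far, which always lies above the current (decayed) value.
expected_average = sum(decayed_mean(base, decay, a) for a in range(pulls)) / pulls

print(f"sample average of past clicks : {estimate:.3f}")
print(f"expected value of that average: {expected_average:.3f}")
print(f"current expected reward       : {current:.3f}")
```

Because the sample average weights old, high-reward rounds as heavily as recent ones, it stays well above the arm's current expected reward. A newly added arm with a higher current reward could therefore be ranked below the old arm, which is the misidentification the abstract describes.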
Related papers
50 in total
  • [21] A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits
    Abbasi-Yadkori, Yasin
    Gyorgy, Andraes
    Lazic, Nevena
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [22] Stochastic Bandits With Non-Stationary Rewards: Reward Attack and Defense
    Yang, Chenye
    Liu, Guanlin
    Lai, Lifeng
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2024, 72 : 5007 - 5020
  • [23] Identification and control of non-stationary time delayed systems
    Dréano, P
    Laurent, R
    ROBUST CONTROL DESIGN 2000, VOLS 1 & 2, 2000, 1-2 : 267 - 271
  • [24] Maintaining time-decaying stream aggregates
    Cohen, E
    Strauss, MJ
    JOURNAL OF ALGORITHMS-COGNITION INFORMATICS AND LOGIC, 2006, 59 (01) : 19 - 36
  • [25] Scalable Time-Decaying Adaptive Prediction Algorithm
    Tan, Yinyan
    Fan, Zhe
    Li, Guilin
    Wang, Fangshan
    Li, Zhengbing
    Liu, Shikai
    Pan, Qiuling
    Xing, Eric P.
    Ho, Qirong
    KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016 : 617 - 626
  • [26] Bayesian analysis of biexponential time-decaying signals
    Whittenburg, S.L.
    Spectrochimica Acta, Part A: Molecular and Biomolecular Spectroscopy, 1998, 54A (04) : 559 - 566
  • [27] Bayesian analysis of biexponential time-decaying signals
    Whittenburg, SL
    SPECTROCHIMICA ACTA PART A-MOLECULAR AND BIOMOLECULAR SPECTROSCOPY, 1998, 54 (04) : 559 - 566
  • [28] Optimal and Efficient Dynamic Regret Algorithms for Non-Stationary Dueling Bandits
    Saha, Aadirupa
    Gupta, Shubham
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022 : 19027 - 19049
  • [29] Non-stationary Continuum-armed Bandits for Online Hyperparameter Optimization
    Lu, Shiyin
    Zhou, Yu-Hang
    Shi, Jing-Cheng
    Zhu, Wenya
    Yu, Qingtao
    Chen, Qing-Guo
    Da, Qing
    Zhang, Lijun
    WSDM'22: PROCEEDINGS OF THE FIFTEENTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING, 2022 : 618 - 627
  • [30] Time-Decaying Sketches for Sensor Data Aggregation
    Cormode, Graham
    Tirthapura, Srikanta
    Xu, Bojian
    PODC'07: PROCEEDINGS OF THE 26TH ANNUAL ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, 2007 : 215 - 224