Weighted Gaussian Process Bandits for Non-stationary Environments

Cited by: 0
Authors
Deng, Yuntian [1 ]
Zhou, Xingyu [2 ]
Kim, Baekjin [3 ]
Tewari, Ambuj [3 ]
Gupta, Abhishek [1 ]
Shroff, Ness [1 ]
Affiliations
[1] Ohio State Univ, Columbus, OH 43210 USA
[2] Wayne State Univ, Detroit, MI 48202 USA
[3] Univ Michigan, Ann Arbor, MI 48109 USA
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we consider the Gaussian process (GP) bandit optimization problem in a non-stationary environment. To capture external changes, the black-box function is allowed to be time-varying within a reproducing kernel Hilbert space (RKHS). For this setting, we develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression. A key challenge is how to cope with infinite-dimensional feature maps. To address it, we leverage kernel approximation techniques to prove a sublinear regret bound, which is the first (frequentist) sublinear regret guarantee on weighted time-varying bandits with general nonlinear rewards. This result generalizes both non-stationary linear bandits and standard GP-UCB algorithms. Further, a novel concentration inequality is derived for weighted Gaussian process regression with general weights. We also provide universal upper bounds and weight-dependent upper bounds for weighted maximum information gains. These results are of independent interest for applications such as news ranking and adaptive pricing, where weights can be adopted to capture the importance or quality of data. Finally, we conduct experiments showing that the proposed algorithm compares favorably with existing methods in many cases.
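As a rough illustration of the idea behind a UCB rule built on weighted GP regression, the sketch below fits a weighted kernel ridge regressor in which recent observations receive larger weights, then picks the candidate with the highest upper confidence value. This is not the paper's WGP-UCB: the exponential discount `gamma`, the constant confidence multiplier `beta`, the regularizer `lam`, the RBF kernel, and every function name here are assumptions made for illustration, and the width used is simply the posterior variance of a GP with per-observation noise `lam / w_s`.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def weighted_gp_posterior(x_obs, y_obs, weights, x_query, lam=1.0):
    """Weighted kernel ridge / GP regression (illustrative, not the paper's estimator).

    Minimises sum_s w_s * (y_s - f(x_s))**2 + lam * ||f||^2, which matches a GP
    posterior with per-observation noise variance lam / w_s.
    """
    K = rbf_kernel(x_obs, x_obs)
    k_q = rbf_kernel(x_obs, x_query)            # shape (n_obs, n_query)
    A = K + lam * np.diag(1.0 / weights)        # K + lam * W^{-1}
    mean = k_q.T @ np.linalg.solve(A, y_obs)
    # The RBF kernel has k(x, x) = 1, hence the leading 1.0.
    var = 1.0 - np.sum(k_q * np.linalg.solve(A, k_q), axis=0)
    return mean, np.maximum(var, 0.0)

def ucb_choice(x_obs, y_obs, candidates, gamma=0.95, beta=2.0, lam=1.0):
    """Pick the candidate maximising mean + beta * std under discount weights."""
    n = len(x_obs)
    # Assumed weighting scheme: the most recent point has weight 1,
    # older points decay geometrically by gamma.
    weights = gamma ** np.arange(n - 1, -1, -1)
    mean, var = weighted_gp_posterior(x_obs, y_obs, weights, candidates, lam)
    return candidates[np.argmax(mean + beta * np.sqrt(var))]
```

Calling `ucb_choice(x_obs, y_obs, np.linspace(0.0, 1.0, 101))` after each round returns the next query point under these assumptions; a smaller `gamma` forgets old observations faster, which is the intuition behind using weights when the underlying function drifts over time.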
Pages: 24
Related papers
50 in total
  • [31] Rewiring Neurons in Non-Stationary Environments
    Sun, Zhicheng
    Mu, Yadong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [32] Social Learning in non-stationary environments
    Boursier, Etienne
    Perchet, Vianney
    Scarsini, Marco
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 167, 2022, 167
  • [33] A New Look at Dynamic Regret for Non-Stationary Stochastic Bandits
    Abbasi-Yadkori, Yasin
    Gyorgy, Andraes
    Lazic, Nevena
    JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [34] Stochastic Bandits With Non-Stationary Rewards: Reward Attack and Defense
    Yang, Chenye
    Liu, Guanlin
    Lai, Lifeng
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2024, 72 : 5007 - 5020
  • [35] FLOODING RISK ASSESSMENT IN STATIONARY AND NON-STATIONARY ENVIRONMENTS
    Thomson, Rhys
    Drynan, Leo
    Ball, James
    Veldema, Ailsa
    Phillips, Brett
    Babister, Mark
    PROCEEDINGS OF THE 36TH IAHR WORLD CONGRESS: DELTAS OF THE FUTURE AND WHAT HAPPENS UPSTREAM, 2015, : 5167 - 5177
  • [36] A Perturbed Gaussian Process Regression with Chunk Sparsification for Tracking Non-stationary Systems
    Li, Dong
    Zhao, Dongbin
    Xia, Zhongpu
    PROCEEDINGS OF THE 28TH CHINESE CONTROL AND DECISION CONFERENCE (2016 CCDC), 2016, : 6639 - 6644
  • [37] DEEP GAUSSIAN PROCESS METAMODELING OF SEQUENTIALLY SAMPLED NON-STATIONARY RESPONSE SURFACES
    Dutordoir, Vincent
    Knudde, Nicolas
    van der Herten, Joachim
    Couckuyt, Ivo
    Dhaene, Tom
    2017 WINTER SIMULATION CONFERENCE (WSC), 2017, : 1728 - 1739
  • [38] Non-stationary Gaussian process regression applied in validation of vehicle dynamics models
    Rhode, Stephan
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2020, 93
  • [39] Prediction of non-stationary response functions using a Bayesian composite Gaussian process
    Davis, Casey B.
    Hans, Christopher M.
    Santner, Thomas J.
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2021, 154
  • [40] NUMERICAL SIMULATION OF STATIONARY AND NON-STATIONARY GAUSSIAN RANDOM PROCESSES
    FRANKLIN, JN
    SIAM REVIEW, 1965, 7 (01) : 68 - &