Weighted Gaussian Process Bandits for Non-stationary Environments

Cited by: 0

Authors:

Deng, Yuntian [1]

Zhou, Xingyu [2]

Kim, Baekjin [3]

Tewari, Ambuj [3]

Gupta, Abhishek [1]

Shroff, Ness [1]

Affiliations:

[1] Ohio State Univ, Columbus, OH 43210 USA

[2] Wayne State Univ, Detroit, MI 48202 USA

[3] Univ Michigan, Ann Arbor, MI 48109 USA

Source:

INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we consider the Gaussian process (GP) bandit optimization problem in a non-stationary environment. To capture external changes, the black-box function is allowed to be time-varying within a reproducing kernel Hilbert space (RKHS). To this end, we develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression. A key challenge is how to cope with infinite-dimensional feature maps. To that end, we leverage kernel approximation techniques to prove a sublinear regret bound, which is the first (frequentist) sublinear regret guarantee on weighted time-varying bandits with general nonlinear rewards. This result generalizes both non-stationary linear bandits and standard GP-UCB algorithms. Further, a novel concentration inequality is achieved for weighted Gaussian process regression with general weights. We also provide universal upper bounds and weight-dependent upper bounds for weighted maximum information gains. These results are of independent interest for applications such as news ranking and adaptive pricing, where weights can be adopted to capture the importance or quality of data. Finally, we conduct experiments to highlight the favorable gains of the proposed algorithm in many cases when compared to existing methods.
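The idea of weighted Gaussian process regression inside a UCB rule, as described in the abstract, can be illustrated with a minimal sketch. It is not the paper's WGP-UCB: it assumes one common way to realize weighting (observation s gets noise variance lam / w_s, i.e. weighted kernel ridge regression), an exponential discount schedule for the weights, an RBF kernel, and a fixed exploration constant. The names `rbf_kernel`, `weighted_gp_posterior`, `ucb_choice`, `gamma`, `beta`, and `lam` are hypothetical, and the confidence width used in the paper may differ.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    # Squared-exponential kernel between 1-D input arrays a and b (assumed kernel).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def weighted_gp_posterior(x_obs, y_obs, weights, x_query, lam=1.0):
    # GP regression in which observation s has noise variance lam / weights[s],
    # so a larger weight makes that sample count more (assumption, see lead-in).
    K = rbf_kernel(x_obs, x_obs)
    k_q = rbf_kernel(x_obs, x_query)              # shape (n_obs, n_query)
    A = K + lam * np.diag(1.0 / weights)
    mean = k_q.T @ np.linalg.solve(A, y_obs)
    # Prior variance is 1 for the RBF kernel; subtract the explained part.
    var = 1.0 - np.sum(k_q * np.linalg.solve(A, k_q), axis=0)
    return mean, np.maximum(var, 0.0)

def ucb_choice(x_obs, y_obs, candidates, gamma=0.9, beta=2.0):
    # Exponential discounting (assumed schedule): the newest sample has weight 1,
    # a sample that is j steps old has weight gamma**j.
    t = len(x_obs)
    weights = gamma ** np.arange(t - 1, -1, -1)
    mean, var = weighted_gp_posterior(x_obs, y_obs, weights, candidates)
    # Pick the candidate with the largest upper confidence bound.
    return candidates[np.argmax(mean + beta * np.sqrt(var))]
```

With all weights equal to 1 the sketch reduces to ordinary GP regression with noise variance `lam`; shrinking the weight of older samples pulls the posterior mean toward recent observations, which is the intuition behind using weights in a time-varying environment.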


Pages: 24
