Efficient Kernel UCB for Contextual Bandits

Cited by: 0

Authors:

Zenati, Houssam [1,2]

Bietti, Alberto [3]

Diemert, Eustache [1]

Mairal, Julien [2]

Martin, Matthieu [1]

Gaillard, Pierre [2]

Affiliations:

[1] Criteo AI Lab, Ann Arbor, MI 48104 USA

[2] INRIA, Grenoble, France

[3] NYU, Ctr Data Sci, New York, NY 10003 USA

Source:

INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151 | 2022 / Vol. 151

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we tackle the computational efficiency of kernelized UCB algorithms in contextual bandits. While standard methods require O(CT^3) complexity, where T is the horizon and the constant C is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems. Specifically, our method relies on incremental Nyström approximations of the joint kernel embedding of contexts and actions. This allows us to achieve a complexity of O(CTm^2), where m is the number of Nyström points. To recover the same regret as the standard kernelized UCB algorithm, m needs to be of the order of the effective dimension of the problem, which is at most O(√T) and nearly constant in some cases.
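
To make the O(m^2) per-round cost in the abstract concrete, the following is a minimal Python sketch of a UCB rule computed in an m-dimensional Nyström feature space. It is an illustration under simplifying assumptions, not the authors' algorithm: the Nyström points (`anchors`) are fixed in advance rather than selected incrementally, the kernel is assumed to be Gaussian, the exploration weight `beta` is a constant, and the confidence width omits any correction for the approximation error. The class name `NystromUCB` and all parameter names are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))


class NystromUCB:
    """Ridge regression + UCB in a fixed Nystrom feature space (illustrative only)."""

    def __init__(self, anchors, lam=1.0, beta=1.0, bandwidth=1.0):
        # Assumption: anchors (m x d) are given up front; the paper instead
        # builds the Nystrom dictionary incrementally.
        self.anchors = anchors
        self.lam, self.beta, self.bandwidth = lam, beta, bandwidth
        w, V = np.linalg.eigh(rbf_kernel(anchors, anchors, bandwidth))
        w = np.maximum(w, 1e-10)          # guard against tiny eigenvalues
        self.proj = V / np.sqrt(w)        # maps k_m(z) to m-dim features
        m = anchors.shape[0]
        self.A_inv = np.eye(m) / lam      # inverse of (lam * I + sum phi phi^T)
        self.b = np.zeros(m)              # sum of reward * phi

    def features(self, Z):
        """Nystrom features for joint (context, action) vectors Z."""
        K_zm = rbf_kernel(np.atleast_2d(Z), self.anchors, self.bandwidth)
        return K_zm @ self.proj

    def ucb(self, Z):
        """Upper confidence bound for each row of Z; O(m^2) per row."""
        Phi = self.features(Z)
        mean = Phi @ (self.A_inv @ self.b)
        var = np.einsum("ij,jk,ik->i", Phi, self.A_inv, Phi)
        return mean + self.beta * np.sqrt(np.maximum(var, 0.0))

    def update(self, z, reward):
        """Rank-one Sherman-Morrison update of the inverse; O(m^2)."""
        phi = self.features(z)[0]
        Av = self.A_inv @ phi
        self.A_inv -= np.outer(Av, Av) / (1.0 + phi @ Av)
        self.b += reward * phi
```

In a bandit loop one would, at each round, stack the current context with every candidate action into the rows of `Z`, play the row maximizing `ucb(Z)`, and call `update` with the observed reward. Because both `ucb` and `update` work on m x m matrices, T rounds cost on the order of T m^2 operations per candidate, which is the scaling the abstract contrasts with the cubic cost of exact kernelized UCB.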

Pages: 5689-5720

Number of pages: 32
