Efficient Kernel UCB for Contextual Bandits

Authors
Zenati, Houssam [1, 2]
Bietti, Alberto [3]
Diemert, Eustache [1]
Mairal, Julien [2]
Martin, Matthieu [1]
Gaillard, Pierre [2]
Affiliations
[1] Criteo AI Lab, Ann Arbor, MI 48104 USA
[2] INRIA, Grenoble, France
[3] NYU, Ctr Data Sci, New York, NY 10003 USA
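Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract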
In this paper, we tackle the computational efficiency of kernelized UCB algorithms in contextual bandits. While standard methods require an O(CT^3) complexity, where T is the horizon and the constant C is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems. Specifically, our method relies on incremental Nyström approximations of the joint kernel embedding of contexts and actions. This allows us to achieve a complexity of O(CTm^2), where m is the number of Nyström points. To recover the same regret as the standard kernelized UCB algorithm, m needs to be of the order of the effective dimension of the problem, which is at most O(√T) and nearly constant in some cases.
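To make the O(m^2) per-round cost concrete, the following is a minimal sketch of a UCB rule computed on Nyström features. It is not the authors' algorithm: it assumes a fixed dictionary of m anchor points instead of the incremental Nyström scheme described in the abstract, a Gaussian kernel, a finite action grid, and a toy reward function. All names (`NystromUCB`, `rbf_kernel`, `anchors`, `lam`, `beta`, `gamma`) are hypothetical and chosen for illustration only.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian kernel matrix between rows of X and rows of Y."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

class NystromUCB:
    """LinUCB-style rule on m-dimensional Nystrom features (fixed dictionary)."""

    def __init__(self, anchors, lam=1.0, beta=1.0, gamma=1.0):
        self.anchors, self.beta, self.gamma = anchors, beta, gamma
        m = anchors.shape[0]
        K_mm = rbf_kernel(anchors, anchors, gamma)
        # K_mm^{-1/2} via eigendecomposition, with a small jitter for stability.
        w, V = np.linalg.eigh(K_mm + 1e-8 * np.eye(m))
        self.K_inv_sqrt = (V / np.sqrt(np.maximum(w, 1e-12))) @ V.T
        self.A_inv = np.eye(m) / lam  # inverse of the regularized design matrix
        self.b = np.zeros(m)          # running sum of reward-weighted features

    def features(self, Z):
        """Map joint (context, action) rows of Z to features of shape (n, m)."""
        return rbf_kernel(Z, self.anchors, self.gamma) @ self.K_inv_sqrt

    def ucb(self, Z):
        """Upper confidence bound for each candidate row of Z."""
        Phi = self.features(Z)
        mean = Phi @ (self.A_inv @ self.b)
        var = np.einsum("ij,jk,ik->i", Phi, self.A_inv, Phi)
        return mean + self.beta * np.sqrt(np.maximum(var, 0.0))

    def update(self, z, reward):
        """Sherman-Morrison rank-one update of A_inv: O(m^2) per round."""
        phi = self.features(z[None, :])[0]
        Av = self.A_inv @ phi
        self.A_inv -= np.outer(Av, Av) / (1.0 + phi @ Av)
        self.b += reward * phi

# Toy run: scalar context x, scalar action a, reward peaks at a = x (assumption).
rng = np.random.default_rng(0)
actions = np.linspace(0.0, 1.0, 11)            # finite action grid
anchors = rng.uniform(0.0, 1.0, size=(20, 2))  # m = 20 Nystrom points
agent = NystromUCB(anchors, lam=1.0, beta=1.0, gamma=5.0)
for t in range(200):
    x = rng.uniform()
    Z = np.column_stack([np.full(actions.size, x), actions])
    a = int(np.argmax(agent.ucb(Z)))
    reward = -(actions[a] - x) ** 2 + 0.05 * rng.normal()
    agent.update(Z[a], reward)
```

In this sketch every round touches only m-by-m matrices, so T rounds with C candidate evaluations each cost on the order of C·T·m^2 operations, in line with the complexity quoted in the abstract. How the anchor points are chosen and grown over time is the part the paper addresses and is deliberately left out here.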
Pages: 5689-5720
Page count: 32