Improved Regret Bounds for Online Kernel Selection Under Bandit Feedback

Cited by: 0
Authors
Li, Junfan [1 ]
Liao, Shizhong [1 ]
Affiliations
[1] Tianjin Univ, Coll Intelligence & Comp, Tianjin 300350, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Model selection; Online learning; Bandit; Kernel method;
DOI
10.1007/978-3-031-26412-2_21
Chinese Library Classification (CLC)
TP18 [Theory of artificial intelligence];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we improve the regret bound for online kernel selection under bandit feedback. The previous algorithm enjoys an $O((\|f\|_{\mathcal{H}_i}^2+1)K^{1/3}T^{2/3})$ expected bound for Lipschitz loss functions. We prove two types of regret bounds improving the previous bound. For smooth loss functions, we propose an algorithm with an $O(U^{2/3}K^{-1/3}(\sum_{i=1}^{K}L_T(f_i^\ast))^{2/3})$ expected bound, where $L_T(f_i^\ast)$ is the cumulative loss of the optimal hypothesis in $\mathbb{H}_i=\{f\in\mathcal{H}_i:\|f\|_{\mathcal{H}_i}\le U\}$. This data-dependent bound keeps the previous worst-case bound and is smaller if most of the candidate kernels match the data well. For Lipschitz loss functions, we propose an algorithm with an $O(U\sqrt{KT}\ln^{2/3}T)$ expected bound that asymptotically improves the previous bound. We apply the two algorithms to online kernel selection with a time constraint and prove new regret bounds matching or improving the previous $O(\sqrt{T\ln K}+\|f\|_{\mathcal{H}_i}\max\{\sqrt{T},T/\sqrt{R}\})$ expected bound, where $R$ is the time budget. Finally, we empirically verify our algorithms on online regression and classification tasks.
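To see why the data-dependent bound for smooth losses retains the previous worst-case rate, here is a minimal sketch; it assumes (beyond what the abstract states) that the per-round losses are bounded by a constant $c$, so that $L_T(f_i^\ast)\le cT$ for every $i$:
\[
U^{2/3}K^{-1/3}\Big(\sum_{i=1}^{K}L_T(f_i^\ast)\Big)^{2/3}
\;\le\; U^{2/3}K^{-1/3}\,(K\,cT)^{2/3}
\;=\; c^{2/3}\,U^{2/3}\,K^{1/3}\,T^{2/3}.
\]
This matches the previous $K^{1/3}T^{2/3}$ worst-case rate up to constants, while the bound becomes much smaller whenever $L_T(f_i^\ast)\ll T$ for most of the candidate kernels, i.e., when most kernels match the data well.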
Pages: 333-348
Number of pages: 16
Related papers
50 records in total (first 10 shown)
  • [1] Regret Bounds for Online Kernel Selection in Continuous Kernel Space
    Zhang, Xiao; Liao, Shizhong; Xu, Jun; Wen, Ji-Rong
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35: 10931-10938
  • [2] High-Probability Kernel Alignment Regret Bounds for Online Kernel Selection
    Liao, Shizhong; Li, Junfan
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, 2021, 12975: 67-83
  • [3] Improved Regret Bounds for Bandit Combinatorial Optimization
    Ito, Shinji; Hatano, Daisuke; Sumita, Hanna; Takemura, Kei; Fukunaga, Takuro; Kakimura, Naonori; Kawarabayashi, Ken-ichi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [4] Online Kernel Selection with Local Regret
    Zhang, X.; Liao, S.-Z.
    Jisuanji Xuebao/Chinese Journal of Computers, 2019, 42(01): 61-72
  • [5] Improved algorithms for bandit with graph feedback via regret decomposition
    He, Yuchen; Zhang, Chihao
    THEORETICAL COMPUTER SCIENCE, 2023, 979
  • [6] Improved Regret Bounds for Projection-free Bandit Convex Optimization
    Garber, Dan; Kretzu, Ben
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108: 2196-2205
  • [7] Regret Bounds for Online Portfolio Selection with a Cardinality Constraint
    Ito, Shinji; Hatano, Daisuke; Sumita, Hanna; Yabe, Akihiro; Fukunaga, Takuro; Kakimura, Naonori; Kawarabayashi, Ken-ichi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [8] Online (Multinomial) Logistic Bandit: Improved Regret and Constant Computation Cost
    Zhang, Yu-Jie; Sugiyama, Masashi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
  • [9] Multiclass Online Learnability under Bandit Feedback
    Raman, Ananth; Raman, Vinod; Subedi, Unique; Mehalel, Idan; Tewari, Ambuj
    INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 237, 2024, 237
  • [10] Online Kernel Selection via Grouped Adversarial Bandit Model
    Li, Junfan; Liao, Shizhong
    2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019), 2019: 682-689