Sparse Linear Contextual Bandits via Relevance Vector Machines

Cited by: 0

Authors:

Gilton, Davis [1]

Willett, Rebecca [1]

Affiliations:

[1] Univ Wisconsin, Elect & Comp Engn, Madison, WI 53706 USA

Source:

2017 INTERNATIONAL CONFERENCE ON SAMPLING THEORY AND APPLICATIONS (SAMPTA) | 2017

DOI:

Not available

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods]

Discipline classification code:

081202

Abstract:

This paper describes a linear multi-armed bandit algorithm that exploits sparsity in the underlying unknown weight vector controlling rewards. In linear multi-armed bandits, a user chooses a sequence of (slot machine) "arms" to pull, and each arm pull results in the user receiving a stochastic reward with mean equal to the inner product between a known feature vector associated with the arm and an unknown weight vector. While linear bandit algorithms have been widely considered in the literature, relatively little is known about how to exploit sparsity in the weight vector. This paper describes a novel approach that leverages ideas from linear Thompson sampling and relevance vector machines, resulting in a scalable approach that adapts to the unknown sparse support. Theoretical regret bounds highlight the proposed algorithm's performance as a function of the sparsity level, and simulations illustrate the advantages of the proposed method over several competing approaches.
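The linear bandit setting in the abstract can be illustrated with a minimal sketch of generic linear Thompson sampling. This is not the paper's algorithm: it uses a plain isotropic Gaussian prior and omits the relevance vector machine step that adapts to the sparse support. The function name `linear_thompson_sampling`, the parameters `noise_sd` and `prior_var`, and the assumption of a known noise level are all hypothetical choices made for illustration.

```python
import numpy as np

def linear_thompson_sampling(arms, theta_true, n_rounds=500, noise_sd=0.1,
                             prior_var=1.0, seed=0):
    """Generic linear Thompson sampling with a Gaussian prior (illustrative only).

    arms:       (n_arms, dim) array of known feature vectors, one per arm.
    theta_true: (dim,) unknown weight vector, used here only to simulate rewards.
    Returns the cumulative regret per round and the final posterior mean.
    """
    rng = np.random.default_rng(seed)
    n_arms, dim = arms.shape
    precision = np.eye(dim) / prior_var   # posterior precision (inverse covariance)
    b = np.zeros(dim)                     # running sum of x * reward / noise variance
    best_mean = np.max(arms @ theta_true) # mean reward of the best arm
    regret = np.zeros(n_rounds)

    for t in range(n_rounds):
        # Current Gaussian posterior over the weight vector.
        cov = np.linalg.inv(precision)
        cov = (cov + cov.T) / 2           # symmetrize against round-off
        mean = cov @ b

        # Sample a plausible weight vector and pull the arm that looks best under it.
        theta_sample = rng.multivariate_normal(mean, cov)
        k = int(np.argmax(arms @ theta_sample))
        x = arms[k]

        # Stochastic reward whose mean is the inner product <x, theta_true>.
        reward = x @ theta_true + noise_sd * rng.standard_normal()

        # Conjugate Bayesian linear-regression update.
        precision += np.outer(x, x) / noise_sd**2
        b += x * reward / noise_sd**2

        regret[t] = best_mean - x @ theta_true

    posterior_mean = np.linalg.solve(precision, b)
    return np.cumsum(regret), posterior_mean
```

A sparse weight vector can be simulated by setting most entries of `theta_true` to zero. In this sketch the prior treats every coordinate identically, which is the gap a sparsity-adaptive prior is meant to close.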

Pages: 518-522

Page count: 5
