Multi-armed linear bandits with latent biases

Authors
Kang, Qiyu [1]
Tay, Wee Peng [1]
She, Rui [1]
Wang, Sijie [1]
Liu, Xiaoqian [2]
Yang, Yuan-Rui [1]
Affiliations
[1] Nanyang Technological University, School of Electrical and Electronic Engineering, Singapore 639798, Singapore
[2] China University of Political Science and Law, School of Sociology, Beijing, People's Republic of China
Keywords
Linear bandit; Multi-armed bandit; Latent bias; Rewards; Recommendation; Models
DOI
10.1016/j.ins.2024.120103
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In a linear stochastic bandit model, each arm corresponds to a vector in Euclidean space, and the expected return observed at each time step is determined by an unknown linear function of the selected arm. This paper addresses the challenge of identifying the optimal arm in a linear stochastic bandit model where latent biases corrupt each arm's expected reward. Unlike traditional linear bandit problems, where the observed return directly represents the reward, this paper considers a scenario where the unbiased reward at each time step remains unobservable. This model is particularly relevant when the observed return is influenced by latent biases that must be carefully excluded from the learning model. For example, in recommendation systems designed to prevent racially discriminatory suggestions, it is crucial to ensure that the users' race does not influence the system. However, the observed return, such as click-through rates, may already have been influenced by racial attributes. In the case where there are finitely many arms, we develop a strategy that achieves $O(|\mathcal{D}| \log n)$ regret, where $|\mathcal{D}|$ is the number of arms and $n$ is the number of time steps. In the case where each arm is chosen from an infinite compact set, our strategy achieves $O(n^{2/3} (\log n)^{1/2})$ regret. Experiments verify the efficiency of our strategy.
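The setting described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' strategy: the abstract does not say how the latent bias enters the observed return, so the code assumes an additive, arm-specific bias, and the arm vectors, parameter vector `theta`, bias values, and the helper `simulate_observed_means` are all hypothetical. It shows why ranking arms by observed return can select a different arm from the one with the highest unbiased reward.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def simulate_observed_means(arms, theta, biases, pulls_per_arm=2000,
                            noise_sd=0.1, seed=0):
    """Average observed return per arm.

    ASSUMPTION: observed return = <theta, x> + bias(x) + Gaussian noise.
    The additive form of the bias is an illustrative choice only.
    """
    rng = random.Random(seed)
    means = []
    for x, b in zip(arms, biases):
        total = 0.0
        for _ in range(pulls_per_arm):
            total += dot(theta, x) + b + rng.gauss(0.0, noise_sd)
        means.append(total / pulls_per_arm)
    return means


# Hypothetical problem instance: three arms in R^2.
arms = [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7)]
theta = (1.0, 0.5)             # unknown to the learner in a real bandit
biases = [-0.5, 0.8, 0.0]      # hypothetical latent biases, one per arm

# Unbiased expected rewards: the quantity the learner actually cares about.
true_rewards = [dot(theta, x) for x in arms]

# Empirical means of the biased observations.
observed = simulate_observed_means(arms, theta, biases)

best_true = max(range(len(arms)), key=lambda i: true_rewards[i])
best_observed = max(range(len(arms)), key=lambda i: observed[i])

print("best arm by unbiased reward:", best_true)
print("best arm by observed return:", best_observed)
```

In this instance the two rankings disagree, which is the situation the abstract describes: a learner that treats the observed return as the reward would converge to an arm favored by the bias.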
Pages: 19
Source
Information Sciences, 2024, 660