Sharing Information in Adversarial Bandit

Cited by: 0

Authors:

St-Pierre, David L. [1,2]

Teytaud, Olivier [2]

Affiliations:

[1] Univ Liege, Inst Montefiore, Dept Elect Engn & Comp Sci, B-4000 Liege, Belgium

[2] Univ Paris 11, TAO, Inria, UMR CNRS 8623, Paris, France

Source:

APPLICATIONS OF EVOLUTIONARY COMPUTATION | 2014 / Vol. 8602

Keywords:

Bandit problem; Monte-Carlo; Nash Equilibrium; Games;

DOI:

10.1007/978-3-662-45523-4_32

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Two-player games provide a popular platform for research in Artificial Intelligence (AI). One of the main challenges arising from this platform is approximating a Nash Equilibrium (NE) of zero-sum matrix games. While computing such a Nash Equilibrium is solvable in polynomial time using Linear Programming (LP), it rapidly becomes infeasible as the size of the matrix grows, a situation commonly encountered in games. This paper focuses on improving the approximation of a NE for matrix games so that it outperforms state-of-the-art algorithms given a finite (and rather small) number T of oracle requests to rewards. To reach this objective, we propose sharing information between the different relevant pure strategies. We show, both theoretically by improving the bound and empirically through experiments on artificial matrices and on a real-world game, that information sharing improves the approximation of the NE.
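
For orientation, the Linear Programming formulation the abstract alludes to can be written in its standard textbook form. This is a sketch of the general LP for a zero-sum matrix game, not a formula taken from the paper itself; the symbols M (the row player's payoff matrix, of size K by K'), x (the row player's mixed strategy), and v (the guaranteed game value) are notation assumed here for illustration.

```latex
% Standard LP for the row player's equilibrium strategy in a zero-sum
% matrix game. M, x, v, K, K' are assumed notation, not the paper's.
\begin{aligned}
\max_{x \in \mathbb{R}^{K},\; v \in \mathbb{R}} \quad & v \\
\text{subject to} \quad & \sum_{i=1}^{K} x_i \, M_{ij} \;\ge\; v
      \qquad \text{for all } j = 1, \dots, K', \\
& \sum_{i=1}^{K} x_i = 1, \qquad x_i \ge 0 \quad \text{for all } i = 1, \dots, K.
\end{aligned}
```

Writing this LP down requires every entry of M, whereas the setting described in the abstract allows only a small number T of reward queries. This is why an approximation is sought instead of an exact solution.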


Pages: 386-398

Number of pages: 13
