Sharing Information in Adversarial Bandit

Cited by: 0
Authors
St-Pierre, David L. [1 ,2 ]
Teytaud, Olivier [2 ]
Affiliations
[1] Univ Liege, Inst Montefiore, Dept Elect Engn & Comp Sci, B-4000 Liege, Belgium
[2] Univ Paris 11, TAO, Inria, UMR CNRS 8623, Paris, France
Keywords
Bandit problem; Monte-Carlo; Nash Equilibrium; Games
DOI
10.1007/978-3-662-45523-4_32
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Two-player games provide a popular platform for research in Artificial Intelligence (AI). One of the main challenges arising from this platform is approximating a Nash Equilibrium (NE) of zero-sum matrix games. Although computing such a Nash Equilibrium is solvable in polynomial time using Linear Programming (LP), it rapidly becomes infeasible as the size of the matrix grows, a situation commonly encountered in games. This paper focuses on improving the approximation of a NE for matrix games so that it outperforms state-of-the-art algorithms given a finite (and rather small) number T of oracle requests to rewards. To reach this objective, we propose sharing information between the different relevant pure strategies. We show, both theoretically by improving the bound and empirically through experiments on artificial matrices and on a real-world game, that information sharing improves the approximation of the NE.
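For readers unfamiliar with the setting, the following is a minimal sketch of how a NE of a zero-sum matrix game can be approximated with a budget of T oracle requests. It uses plain Exp3 in self-play, a standard adversarial bandit baseline, and is not the information-sharing method proposed in the paper. The function name `exp3_self_play`, the parameters `eta` and `gamma`, and the matching-pennies matrix are assumptions chosen for illustration.

```python
import math
import random


def exp3_self_play(matrix, T, eta=0.05, gamma=0.1, seed=0):
    """Approximate a NE of a zero-sum matrix game with Exp3 in self-play.

    matrix[i][j] is the row player's reward in [0, 1]; the column player
    receives 1 - matrix[i][j]. Each iteration makes one oracle request.
    Returns the empirical play frequencies of both players.
    (Illustrative baseline only; eta and gamma are hypothetical defaults.)
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    log_w_row = [0.0] * n_rows  # log-weights of the row player's arms
    log_w_col = [0.0] * n_cols  # log-weights of the column player's arms
    counts_row = [0] * n_rows
    counts_col = [0] * n_cols

    def mixed(log_w):
        # Softmax of the log-weights, mixed with uniform exploration.
        m = max(log_w)
        w = [math.exp(v - m) for v in log_w]
        total = sum(w)
        k = len(w)
        return [(1 - gamma) * x / total + gamma / k for x in w]

    for _ in range(T):
        p = mixed(log_w_row)
        q = mixed(log_w_col)
        i = rng.choices(range(n_rows), weights=p)[0]
        j = rng.choices(range(n_cols), weights=q)[0]
        r = matrix[i][j]  # one oracle request to the reward
        # Importance-weighted updates: only the played arms are updated.
        log_w_row[i] += eta * r / p[i]
        log_w_col[j] += eta * (1 - r) / q[j]
        counts_row[i] += 1
        counts_col[j] += 1

    return [c / T for c in counts_row], [c / T for c in counts_col]


# Matching pennies: the unique NE is uniform play for both players.
row_strategy, col_strategy = exp3_self_play([[1, 0], [0, 1]], T=2000)
```

Note that each player updates only the arm it actually played. The abstract's proposal is to share information between relevant pure strategies, which this baseline does not do.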
Pages: 386-398
Page count: 13