Strategizing against No-regret Learners

Cited by: 0

Authors:

Deng, Yuan [1]

Schneider, Jon [2]

Sivan, Balasubramanian [2]

Affiliations:

[1] Duke Univ, Durham, NC 27706 USA

[2] Google Res, Mountain View, CA USA

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019) | 2019 / Vol. 32

Keywords:

DOI:

Not available

Chinese Library Classification:

TP18 [Artificial Intelligence Theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

How should a player who repeatedly plays a game against a no-regret learner strategize to maximize his utility? We study this question and show that under some mild assumptions, the player can always guarantee himself a utility of at least what he would get in a Stackelberg equilibrium of the game. When the no-regret learner has only two actions, we show that the player cannot get any higher utility than the Stackelberg equilibrium utility. But when the no-regret learner has more than two actions and plays a mean-based no-regret strategy, we show that the player can get strictly higher than the Stackelberg equilibrium utility. We provide a characterization of the optimal game-play for the player against a mean-based no-regret learner as a solution to a control problem. When the no-regret learner's strategy also guarantees him a no-swap regret, we show that the player cannot get anything higher than a Stackelberg equilibrium utility.
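The first claim of the abstract, that the player can secure roughly his Stackelberg utility, can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes a hypothetical 2x2 game, a learner running multiplicative weights (Hedge) with full-information feedback on expected payoffs, and a player who simply repeats one fixed mixed strategy every round. In this game the Stackelberg utility is 2.5, reached in the limit by committing to U with probability just under 0.5. The commitment of 0.4 used here makes R the learner's strict best response, with a limiting utility of 2.4.

```python
import math

# Hypothetical 2x2 game (an assumption for illustration, not from the paper).
# Rows are the player's actions (U, D); columns are the learner's actions (L, R).
OPT_PAYOFF = [[1.0, 3.0], [0.0, 2.0]]
LEARNER_PAYOFF = [[1.0, 0.0], [0.0, 1.0]]


def hedge_distribution(cum_payoffs, eta):
    """Distribution proportional to exp(eta * cumulative payoff)."""
    top = max(cum_payoffs)  # subtract the max for numerical stability
    weights = [math.exp(eta * (c - top)) for c in cum_payoffs]
    total = sum(weights)
    return [w / total for w in weights]


def play_against_hedge(commitment, rounds):
    """Average player utility when `commitment` is played every round against Hedge."""
    n_player = len(commitment)
    n_learner = len(LEARNER_PAYOFF[0])
    eta = math.sqrt(math.log(n_learner) / rounds)  # a standard step-size choice
    cum = [0.0] * n_learner
    total_utility = 0.0
    dist = hedge_distribution(cum, eta)
    for _ in range(rounds):
        # The learner's mixed action depends only on past cumulative payoffs.
        dist = hedge_distribution(cum, eta)
        # Expected player utility this round under both mixed strategies.
        total_utility += sum(
            commitment[i] * dist[j] * OPT_PAYOFF[i][j]
            for i in range(n_player)
            for j in range(n_learner)
        )
        # Full-information feedback: expected payoff of each learner action.
        for j in range(n_learner):
            cum[j] += sum(
                commitment[i] * LEARNER_PAYOFF[i][j] for i in range(n_player)
            )
    return total_utility / rounds, dist


# Commit to U with probability 0.4, so the learner's best response is R.
avg_utility, final_dist = play_against_hedge([0.4, 0.6], rounds=20000)
print(avg_utility, final_dist)
```

Under these assumptions the learner's distribution concentrates on R, and the player's average utility approaches 2.4 as the number of rounds grows. That is well above the utility of 1 he would get at the game's Nash equilibrium (U, L). The paper's stronger results, on beating the Stackelberg utility against mean-based learners, need a time-varying strategy that this sketch does not attempt.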


Pages: 9
