Is Learning in Games Good for the Learners?

Cited by: 0
Authors
Brown, William [1]
Schneider, Jon [2]
Vodrahalli, Kiran [2]
Affiliations
[1] Columbia Univ, New York, NY 10027 USA
[2] Google Res, Mountain View, CA USA
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We consider several questions about the tradeoffs between reward and regret in repeated gameplay between two agents. To facilitate this, we introduce a notion of generalized equilibrium which allows for asymmetric regret constraints and yields, for each pair of regret constraints, a polytope of feasible values for each agent; we show that any such equilibrium is reachable by a pair of algorithms which maintain their regret guarantees against arbitrary opponents. As a central example, we highlight the case where one agent is no-swap and the other's regret is unconstrained. We show that this captures an extension of Stackelberg equilibria with a matching optimal value, and that there exists a wide class of games in which a player can significantly increase their utility by deviating from a no-swap-regret algorithm against a no-swap learner (in fact, almost any game without pure Nash equilibria is of this form). Additionally, we use generalized equilibria to consider tradeoffs in terms of the opponent's choice of algorithm. We give a tight characterization of the maximal reward obtainable against some no-regret learner, yet we also exhibit a class of games in which this value is bounded away from the value obtainable against the class of common "mean-based" no-regret algorithms. Finally, we consider the question of learning reward-optimal strategies via repeated play with a no-regret agent when the game is initially unknown. Again we show tradeoffs depending on the opponent's learning algorithm: the Stackelberg strategy is learnable in exponential time with any no-regret agent (and in polynomial time with any no-adaptive-regret agent) for any game in which it is learnable via queries, and there are games in which it is learnable in polynomial time against any no-swap-regret agent but requires exponential time against a mean-based no-regret agent.
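The no-regret learners discussed in the abstract can be made concrete with a standard algorithm. The sketch below is a generic illustration, not a construction from the paper: it runs two Hedge (multiplicative-weights) learners against each other in matching pennies, a game with no pure Nash equilibrium, and reports each player's average external regret. The game matrix, step size, and horizon are arbitrary choices for the demo.

```python
import math
import random

def hedge_play(payoff, T=5000, eta=0.05, seed=0):
    """Two Hedge learners repeatedly play a two-player game given by
    payoff[i][j] = (row payoff, col payoff), payoffs in [-1, 1].
    Returns each player's average external regret after T rounds."""
    rng = random.Random(seed)
    n, m = len(payoff), len(payoff[0])
    w_row, w_col = [1.0] * n, [1.0] * m      # Hedge weights
    cum_row = [0.0] * n                       # payoff of each fixed row action in hindsight
    cum_col = [0.0] * m
    earned_row = earned_col = 0.0             # realized payoffs
    for _ in range(T):
        p = [w / sum(w_row) for w in w_row]   # row mixed strategy
        q = [w / sum(w_col) for w in w_col]   # column mixed strategy
        i = rng.choices(range(n), p)[0]
        j = rng.choices(range(m), q)[0]
        earned_row += payoff[i][j][0]
        earned_col += payoff[i][j][1]
        # Multiplicative-weights update against the opponent's realized action.
        for a in range(n):
            cum_row[a] += payoff[a][j][0]
            w_row[a] *= math.exp(eta * payoff[a][j][0])
        for b in range(m):
            cum_col[b] += payoff[i][b][1]
            w_col[b] *= math.exp(eta * payoff[i][b][1])
    # External regret: best fixed action in hindsight minus realized payoff.
    return (max(cum_row) - earned_row) / T, (max(cum_col) - earned_col) / T

# Matching pennies: no pure Nash equilibrium.
mp = [[(1, -1), (-1, 1)], [(-1, 1), (1, -1)]]
r_row, r_col = hedge_play(mp)
print(f"avg external regret: row={r_row:.3f}, col={r_col:.3f}")
```

Hedge guarantees average external regret at most roughly eta + ln(n)/(eta*T), so both values shrink toward zero as T grows; the paper's central question is whether a player can do better than such an algorithm by exploiting the opponent's learning dynamics.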
Pages: 22