Choosing Well Your Opponents: How to Guide the Synthesis of Programmatic Strategies

Citations: 0
Authors
Moraes, Rubens O. [1, 2, 3]
Aleixo, David S. [1]
Ferreira, Lucas N. [2, 3]
Lelis, Levi H. S. [2, 3]
Affiliations
[1] Univ Fed Vicosa, Dept Informat, Vicosa, Brazil
[2] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
[3] Alberta Machine Intelligence Inst Amii, Edmonton, AB, Canada
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces Local Learner (2L), an algorithm for providing a set of reference strategies to guide the search for programmatic strategies in two-player zero-sum games. Previous learning algorithms, such as Iterated Best Response (IBR), Fictitious Play (FP), and Double-Oracle (DO), can be computationally expensive or miss important information for guiding search algorithms. 2L actively selects a set of reference strategies to improve the search signal. We empirically demonstrate the advantages of our approach while guiding a local search algorithm for synthesizing strategies in three games, including MicroRTS, a challenging real-time strategy game. Results show that 2L learns reference strategies that provide a stronger search signal than IBR, FP, and DO. We also simulate a tournament of MicroRTS, where a synthesizer using 2L outperformed the winners of the two latest MicroRTS competitions, which were programmatic strategies written by human programmers.
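The difference between the reference sets used by IBR and FP can be illustrated with a minimal sketch. It is not the authors' 2L algorithm, whose selection rule the abstract does not describe, and it replaces program synthesis with a single-population self-play loop over rock-paper-scissors actions. The names `PAYOFF`, `best_response`, and `self_play` are hypothetical and chosen only for this illustration. In the sketch, IBR responds only to the most recent strategy, while FP responds to every strategy seen so far.

```python
from typing import List, Sequence

# Payoff of playing action a against action b in rock-paper-scissors
# (0 = rock, 1 = paper, 2 = scissors). Assumed toy game, not from the paper.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def best_response(references: Sequence[int]) -> int:
    """Return the action with the highest total payoff against the reference set.

    Ties are broken in favor of the lowest action index.
    """
    totals = [sum(PAYOFF[a][b] for b in references) for a in range(3)]
    return max(range(3), key=lambda a: totals[a])


def self_play(steps: int, use_all_history: bool) -> List[int]:
    """Run self-play starting from rock.

    use_all_history=False mimics IBR (reference set = last strategy only);
    use_all_history=True mimics FP (reference set = all past strategies).
    """
    history = [0]
    for _ in range(steps):
        references = history if use_all_history else history[-1:]
        history.append(best_response(references))
    return history


if __name__ == "__main__":
    print("IBR:", self_play(6, use_all_history=False))
    print("FP: ", self_play(6, use_all_history=True))
```

In this toy setting the IBR loop cycles through rock, paper, and scissors because each step forgets everything except the latest opponent, whereas the FP loop keeps all past strategies in its reference set and therefore evaluates each candidate against more opponents per step.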
Pages: 4847-4854
Number of pages: 8