Improving Exploration in UCT Using Local Manifolds

Cited by: 0
Authors
Srinivasan, Sriram [1]
Talvitie, Erik [2]
Bowling, Michael [1]
Affiliations
[1] Univ Alberta, Edmonton, AB, Canada
[2] Franklin & Marshall Coll, Lancaster, PA 17604 USA
Keywords
SEARCH;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Monte Carlo planning has proven successful in many sequential decision-making settings, but it explores poorly when rewards are sparse. In this paper, we improve exploration in UCT by generalizing across similar states using a given distance metric. When the state space does not have a natural distance metric, we show how to learn a local manifold from the transition graph of states in the near future to obtain a distance metric. On domains inspired by video games, empirical evidence shows that our algorithm is more sample efficient than UCT, particularly when rewards are sparse.
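The abstract does not give the algorithm's formulas, so the following is only a minimal sketch of the general idea of sharing UCT statistics across nearby states. It assumes a Gaussian kernel over the supplied distance and a UCB1-style bonus computed from kernel-weighted counts. The names `gaussian_kernel`, `generalized_stats`, `select_action`, `bandwidth`, and `c` are hypothetical and are not taken from the paper.

```python
import math


def gaussian_kernel(distance, bandwidth=1.0):
    # Assumed similarity function: 1.0 at distance 0, decaying with distance.
    return math.exp(-(distance / bandwidth) ** 2)


def generalized_stats(state, action, stats, distance_fn, bandwidth=1.0):
    """Kernel-weighted visit count and mean return for (state, action).

    stats maps (state, action) -> (visit_count, total_return).
    Every stored state contributes, weighted by its similarity to `state`.
    """
    weighted_count = 0.0
    weighted_return = 0.0
    for (other_state, other_action), (count, total) in stats.items():
        if other_action != action:
            continue
        weight = gaussian_kernel(distance_fn(state, other_state), bandwidth)
        weighted_count += weight * count
        weighted_return += weight * total
    mean = weighted_return / weighted_count if weighted_count > 0 else 0.0
    return weighted_count, mean


def select_action(state, actions, stats, distance_fn, c=1.4, bandwidth=1.0):
    """UCB1-style selection using the generalized counts (illustrative only)."""
    estimates = {
        a: generalized_stats(state, a, stats, distance_fn, bandwidth)
        for a in actions
    }
    total = sum(count for count, _ in estimates.values())
    best_action, best_score = None, -math.inf
    for a in actions:
        count, mean = estimates[a]
        if count <= 0:
            # No similar experience at all: try this action first.
            return a
        # log(total + 1) keeps the bonus defined when total is below 1.
        score = mean + c * math.sqrt(math.log(total + 1.0) / count)
        if score > best_score:
            best_action, best_score = a, score
    return best_action
```

With integer states and `distance_fn = lambda s, t: abs(s - t)`, returns observed in a neighbouring state raise the estimate for the same action in the current state, so a sparse reward found nearby can steer the search before the current state has seen it directly.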
Pages: 3386-3392
Number of pages: 7
Related papers
50 results in total
  • [1] Understanding and Improving Local Exploration for GBFS
    Xie, Fan
    Mueller, Martin
    Holte, Robert
    PROCEEDINGS OF THE TWENTY-FIFTH INTERNATIONAL CONFERENCE ON AUTOMATED PLANNING AND SCHEDULING, 2015: 244 - 248
  • [2] Improving UCT Planning via Approximate Homomorphisms
    Jiang, Nan
    Singh, Satinder
    Lewis, Richard
    AAMAS'14: PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2014: 1289 - 1296
  • [3] Improving PAC Exploration Using the Median of Means
    Pazis, Jason
    Parr, Ronald
    How, Jonathan P.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [4] USING THE MPPA ARCHITECTURE FOR UCT PARALLELIZATION
    Hufschmitt, Aline
    Mehat, Jean
    Vittaut, Jean-Noel
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCES ON INTERFACES AND HUMAN COMPUTER INTERACTION 2015, GAME AND ENTERTAINMENT TECHNOLOGIES 2015 AND COMPUTER GRAPHICS, VISUALIZATION, COMPUTER VISION AND IMAGE PROCESSING 2015, 2015: 109 - 115
  • [5] Knowledge Generation for Improving Simulations in UCT for General Game Playing
    Sharma, Shiven
    Kobti, Ziad
    Goodwin, Scott
    AI 2008: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2008, 5360 : 49 - 55
  • [6] UCT CORRELATION USING THE BHATTACHARYYA DIVERGENCE
    Hussein, Islam I.
    Roscoe, Christopher W. T.
    Schumacher, Paul W., Jr.
    Wilkins, Matthew P.
    SPACEFLIGHT MECHANICS 2016, PTS I-IV, 2016, 158 : 4015 - 4031
  • [7] UCT initiates research into improving South African permeable paving performance
    Beer, David
    Betonwerk und Fertigteil-Technik/Concrete Plant and Precast Technology, 2019, 85 (09): 48 - 53
  • [8] Local ordering on manifolds
    Krym, V.R.
    Petrov, N.N.
    Vestnik Sankt-Peterburgskogo Universiteta. Ser 1. Matematika Mekhanika Astronomiya, 2001, (03): 32 - 36
  • [9] To Create Adaptive Game Opponent by Using UCT
    He, Suoju
    Xie, Fan
    Wang, Yi
    Luo, Sai
    Fu, Yiwen
    Yang, Jiajian
    Liu, Zhiqing
    Zhu, Qiliang
    2008 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE FOR MODELLING CONTROL & AUTOMATION, VOLS 1 AND 2, 2008: 67 - 70
  • [10] LOCAL SIMILARITY MANIFOLDS
    VAISMAN, I
    REISCHER, C
    ANNALI DI MATEMATICA PURA ED APPLICATA, 1983, 135 : 279 - 291