A Fast and Reliable Policy Improvement Algorithm

Authors
Abbasi-Yadkori, Yasin [1]
Bartlett, Peter L. [1, 2]
Wright, Stephen J. [3]
Affiliations
[1] Queensland Univ Technol, Brisbane, Qld, Australia
[2] Univ Calif Berkeley, Berkeley, CA USA
[3] Univ Wisconsin, Madison, WI 53706 USA
Funding
Australian Research Council
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a simple, efficient method that improves stochastic policies for Markov decision processes. The computational complexity is the same as that of the value estimation problem. We prove that when the value estimation error is small, this method gives an improvement in performance that increases with certain variance properties of the initial policy and transition dynamics. Performance in numerical experiments compares favorably with previous policy improvement algorithms.
Pages: 1338-1346
Page count: 9