A Fast and Reliable Policy Improvement Algorithm

Cited by: 0
Authors
Abbasi-Yadkori, Yasin [1]
Bartlett, Peter L. [1, 2]
Wright, Stephen J. [3]
Affiliations
[1] Queensland Univ Technol, Brisbane, Qld, Australia
[2] Univ Calif Berkeley, Berkeley, CA USA
[3] Univ Wisconsin, Madison, WI 53706 USA
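Funding
Australian Research Council
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract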
We introduce a simple, efficient method that improves stochastic policies for Markov decision processes. The computational complexity is the same as that of the value estimation problem. We prove that when the value estimation error is small, this method gives an improvement in performance that increases with certain variance properties of the initial policy and transition dynamics. Performance in numerical experiments compares favorably with previous policy improvement algorithms.
Pages: 1338-1346
Page count: 9
Related papers
50 in total
  • [21] Research and improvement of feature detection algorithm based on FAST
    Li, Yulin
    Zheng, Wenfeng
    Liu, Xiangjun
    Mou, Yuanyuan
    Yin, Lirong
    Yang, Bo
    RENDICONTI LINCEI-SCIENZE FISICHE E NATURALI, 2021, 32 (04) : 775 - 789
  • [22] A Fast Tabu Search Algorithm for the Reliable P-Median Problem
    Li, Qingwei
    Savachkin, Alex
    ADVANCES IN GLOBAL OPTIMIZATION, 2015, 95 : 417 - 424
  • [23] FAST AND RELIABLE NOISE ESTIMATION ALGORITHM BASED ON STATISTICAL HYPOTHESIS TESTS
    Jiang, Ping
    Zhang, Jian-Zhou
    2012 IEEE VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2012,
  • [24] A fast, highly reliable data compression chip and algorithm for storage systems
    Cheng, JM
    Duyanovich, LM
    Craft, DJ
    IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 1996, 40 (06) : 603 - 613
  • [25] A FAST AND RELIABLE STATE ESTIMATION ALGORITHM FOR AEPS NEW CONTROL CENTER
    ALLEMONG, JJ
    RADU, L
    SASSON, AM
    IEEE TRANSACTIONS ON POWER APPARATUS AND SYSTEMS, 1982, 101 (04) : 933 - 944
  • [26] HaploGrep: A Fast and Reliable Algorithm for Automatic Classification of Mitochondrial DNA Haplogroups
    Kloss-Brandstaetter, Anita
    Pacher, Dominic
    Schoenherr, Sebastian
    Weissensteiner, Hansi
    Binna, Robert
    Specht, Guenther
    Kronenberg, Florian
    HUMAN MUTATION, 2011, 32 (01) : 25 - 32
  • [27] Fast and reliable
    Wagner, Manfred
    Hofmann, Karsten
    Wireworld, 1994, 36 (02) : 16 - 18
  • [28] Qualitative path estimation: A fast and reliable algorithm for qualitative trend analysis
    Villez, Kris
    AICHE JOURNAL, 2015, 61 (05) : 1535 - 1546
  • [29] Fast and Reliable Tracking Algorithm for On-Road Vehicle Detection Systems
    Baek, Jang Woon
    Han, Byung-Gil
    Kang, Hyunwoo
    Chung, Yoonsu
    Lee, Su-In
    2016 EIGHTH INTERNATIONAL CONFERENCE ON UBIQUITOUS AND FUTURE NETWORKS (ICUFN), 2016, : 70 - 72
  • [30] Dynamic Cache Allocation Algorithm and Replacement Policy for Reliable Multicast Network
    Zhang, Jingyu
    Li, Zhishu
    Chen, Liangyin
    2009 5TH INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-8, 2009, : 3975 - 3979