An Empirical Relative Value Learning Algorithm for Non-parametric MDPs with Continuous State Space

Cited by: 0
Authors
Sharma, Hiteshi [1 ]
Jain, Rahul [1 ]
Gupta, Abhishek [2 ]
Affiliations
[1] Univ Southern Calif, Dept Elect Engn, Los Angeles, CA 90007 USA
[2] Ohio State Univ, Dept Elect Engn, Columbus, OH 43210 USA
Keywords
DOI
10.23919/ecc.2019.8795982
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
We propose an empirical relative value learning (ERVL) algorithm for non-parametric MDPs with continuous state space, finite actions, and the average reward criterion. The ERVL algorithm relies on function approximation via nearest neighbors and on minibatch samples for the value function update. It is universal (it works for any MDP), computationally simple, and yet provides an arbitrarily good approximation with high probability in finite time. As far as we know, this is the first algorithm for non-parametric, continuous state space MDPs under the average reward criterion with these provable properties. Numerical evaluation on a benchmark optimal replacement problem suggests good performance.
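The ingredients named in the abstract (nearest-neighbor function approximation, minibatch sampling of next states, and a relative value update for average reward) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy replacement model in `step` and `reward`, the choice of the first anchor as the reference state, and all names and parameter values are assumptions made for illustration only.

```python
import numpy as np

REPLACE_COST = 0.7  # hypothetical cost of replacing the machine


def knn_predict(anchors, values, queries, k):
    """Average the values of the k anchors nearest to each query (1-D states)."""
    dist = np.abs(queries[:, None] - anchors[None, :])   # shape (Q, N)
    idx = np.argpartition(dist, k - 1, axis=1)[:, :k]    # indices of k nearest
    return values[idx].mean(axis=1)


def step(x, a, rng, size):
    """Toy model (assumed): action 1 resets wear to 0, then random wear is added."""
    base = 0.0 if a == 1 else x
    return np.minimum(base + rng.exponential(0.2, size=size), 1.0)


def reward(x, a):
    """Toy reward (assumed): operating cost equals wear, replacing costs a constant."""
    return np.full_like(x, -REPLACE_COST) if a == 1 else -x


def relative_value_sketch(n_anchors=200, n_next=20, k=5, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    anchors = rng.uniform(0.0, 1.0, size=n_anchors)  # sampled states
    v = np.zeros(n_anchors)                          # relative values at anchors
    gain = 0.0
    for _ in range(iters):
        q = np.empty((2, n_anchors))
        for a in (0, 1):
            # Minibatch of next states for every anchor: shape (n_anchors, n_next).
            nxt = step(anchors[:, None], a, rng, (n_anchors, n_next))
            v_next = knn_predict(anchors, v, nxt.ravel(), k)
            q[a] = reward(anchors, a) + v_next.reshape(n_anchors, n_next).mean(axis=1)
        tv = q.max(axis=0)   # empirical Bellman update at the anchors
        gain = tv[0]         # value at the reference anchor estimates average reward
        v = tv - gain        # subtract it so the iterates stay bounded
    return anchors, v, gain


if __name__ == "__main__":
    _, _, gain = relative_value_sketch()
    print("estimated average reward:", gain)
```

Subtracting the value at a fixed reference state is one common way to keep average-reward value iterates bounded; the paper's own normalization and sampling scheme may differ.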
Pages: 1368-1373
Page count: 6
Related papers
50 in total
  • [21] LEARNING NON-PARAMETRIC MODELS OF PRONUNCIATION
    Hutchinson, Brian
    Droppo, Jasha
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 4904 - 4907
  • [22] Non-parametric Representation Learning with Kernels
    Esser, Pascal
    Fleissner, Maximilian
    Ghoshdastidar, Debarghya
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 11, 2024, : 11910 - 11918
  • [23] Imitation Learning with Non-Parametric Regression
    Vaandrager, Maarten
    Babuska, Robert
    Busoniu, Lucian
    Lopes, Gabriel A. D.
    2012 IEEE INTERNATIONAL CONFERENCE ON AUTOMATION, QUALITY AND TESTING, ROBOTICS, THETA 18TH EDITION, 2012, : 91 - 96
  • [24] Integrating relative survival in multi-state models-a non-parametric approach
    Manevski, Damjan
    Putter, Hein
    Perme, Maja Pohar
    Bonneville, Edouard F.
    Schetelig, Johannes
    de Wreede, Liesbeth C.
    STATISTICAL METHODS IN MEDICAL RESEARCH, 2022, 31 (06) : 997 - 1012
  • [25] Riemannian geometry on non-parametric probability space
    Combe-Nencka, H.
    Combe, Ph.
    NEURAL NETWORK WORLD, 2006, 16 (06) : 459 - 473
  • [26] Statistical non-parametric mapping in sensor space
    Wagner M.
    Tech R.
    Fuchs M.
    Kastner J.
    Gasca F.
    Biomedical Engineering Letters, 2017, 7 (3) : 193 - 203
  • [27] Unified parametric and non-parametric ICA algorithm for arbitrary sources
    Wang, Fasong
    Li, Hongwei
    Li, Rui
    Yu, Shaoquan
    ADVANCES IN NEURAL NETWORKS - ISNN 2006, PT 1, 2006, 3971 : 1121 - 1126
  • [28] A non-parametric comparison algorithm for simulation optimization
    Alkhamis, TM
    Ahmed, MA
    7TH WORLD MULTICONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL XIV, PROCEEDINGS: COMPUTER SCIENCE, ENGINEERING AND APPLICATIONS, 2003, : 402 - 407
  • [29] A fast non-parametric density estimation algorithm
    Egecioglu, O
    Srinivasan, A
    COMMUNICATIONS IN NUMERICAL METHODS IN ENGINEERING, 1997, 13 (10) : 755 - 763
  • [30] Robustness of the parametric MLSE algorithm against non-parametric channels
    Chen, JT
    GLOBECOM 98: IEEE GLOBECOM 1998 - CONFERENCE RECORD, VOLS 1-6: THE BRIDGE TO GLOBAL INTEGRATION, 1998, : 142 - 147