Solving semi-Markov decision problems using average reward reinforcement learning

Cited by: 0
Author affiliations
[1] Dept. Indust. and Mgmt. Syst. Eng., University of South Florida, Tampa, FL 33620, United States
[2] Unknown
[3] Unknown
Source
Manage Sci, no. 4, pp. 560-574
DOI
Not available
Related papers
50 in total
  • [11] Adaptive aggregation for reinforcement learning in average reward Markov decision processes
    Ortner, Ronald
    ANNALS OF OPERATIONS RESEARCH, 2013, 208 (01) : 321 - 336
  • [12] SEMI-MARKOV DECISION-PROCESSES WITH POLYNOMIAL REWARD
    ROSBERG, Z
    JOURNAL OF APPLIED PROBABILITY, 1982, 19 (02) : 301 - 309
  • [13] Continuous-time Markov Decision Process with Average Reward: Using Reinforcement Learning Method
    Jia, Shengde
    Shen, Lincheng
    Xue, Hongtao
    2015 34TH CHINESE CONTROL CONFERENCE (CCC), 2015, : 3097 - 3100
  • [14] Learning to maximize reward rate: a model based on semi-Markov decision processes
    Khodadadi, Arash
    Fakhari, Pegah
    Busemeyer, Jerome R.
    FRONTIERS IN NEUROSCIENCE, 2014, 8
  • [15] Risk-Sensitivity and Average Optimality in Markov and Semi-Markov Reward Processes
    Sladky, Karel
    38TH INTERNATIONAL CONFERENCE ON MATHEMATICAL METHODS IN ECONOMICS (MME 2020), 2020, : 537 - 543
  • [16] Constrained semi-markov decision processes with average rewards
    Feinberg, E.A.
    ZOR. Zeitschrift Fuer Operations Research, 1994, 40 (03)
  • [17] Semi-Markov Offline Reinforcement Learning for Healthcare
    Fatemi, Mehdi
    Wu, Mary
    Petch, Jeremy
    Nelson, Walter
    Connolly, Stuart J.
    Benz, Alexander
    Carnicelli, Anthony
    Ghassemi, Marzyeh
    CONFERENCE ON HEALTH, INFERENCE, AND LEARNING, VOL 174, 2022, 174 : 119 - 137
  • [18] Adaptive Honeypot Engagement Through Reinforcement Learning of Semi-Markov Decision Processes
    Huang, Linan
    Zhu, Quanyan
    DECISION AND GAME THEORY FOR SECURITY, 2019, 11836 : 196 - 216
  • [19] Semi-Markov and reward fields
    Soltani, A. R.
    Ghasemi, H.
    STATISTICS & PROBABILITY LETTERS, 2014, 95 : 71 - 76
  • [20] MAXIMAL AVERAGE-REWARD POLICIES FOR SEMI-MARKOV DECISION PROCESSES WITH ARBITRARY STATE AND ACTION SPACE
    LIPPMAN, SA
    ANNALS OF MATHEMATICAL STATISTICS, 1971, 42 (05) : 1717 - &