Relative Q-Learning for Average-Reward Markov Decision Processes with Continuous States

Cited by: 1
Authors
Yang, Xiangyu [1]
Hu, Jiaqiao [2]
Hu, Jian-Qiang [3]
Affiliations
[1] Shandong University, School of Management, Jinan, 250100, China
[2] State University of New York, Department of Applied Mathematics & Statistics, Stony Brook, NY, 11794, United States
[3] Fudan University, School of Management, Shanghai, 200433, China
Funding
China Postdoctoral Science Foundation; U.S. National Science Foundation; National Natural Science Foundation of China
Keywords
Approximation algorithms - Behavioral research - Decision making - Learning algorithms - Online systems - Reinforcement learning - Uncertainty analysis
DOI
10.1109/TAC.2024.3371380
Pages: 6546 - 6560
Related papers
50 in total
  • [1] Learning and Planning in Average-Reward Markov Decision Processes
    Wan, Yi
    Naik, Abhishek
    Sutton, Richard S.
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139 : 7665 - 7676
  • [2] Robust Average-Reward Markov Decision Processes
    Wang, Yue
    Velasquez, Alvaro
    Atia, George
    Prater-Bennette, Ashley
    Zou, Shaofeng
    [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 12, 2023 : 15215 - 15223
  • [3] Average-Reward Decentralized Markov Decision Processes
    Petrik, Marek
    Zilberstein, Shlomo
    [J]. 20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2007, : 1997 - 2002
  • [4] REVERSIBLE MARKOV DECISION PROCESSES WITH AN AVERAGE-REWARD CRITERION
    Cogill, Randy
    Peng, Cheng
    [J]. SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2013, 51 (01) : 402 - 418
  • [5] Learning Infinite-Horizon Average-Reward Markov Decision Processes with Constraints
    Chen, Liyu
    Jain, Rahul
    Luo, Haipeng
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [6] Finite Sample Analysis of Average-Reward TD Learning and Q-Learning
    Zhang, Sheng
    Zhang, Zhe
    Maguluri, Siva Theja
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [7] Sharper Model-free Reinforcement Learning for Average-reward Markov Decision Processes
    Zhang, Zihan
    Xie, Qiaomin
    [J]. THIRTY SIXTH ANNUAL CONFERENCE ON LEARNING THEORY, VOL 195, 2023, 195
  • [8] Incremental Improvements of Heuristic Policies for Average-Reward Markov Decision Processes
    Reveliotis, S.
    Ibrahim, M.
    [J]. IFAC PAPERSONLINE, 2020, 53 (02) : 1721 - 1728
  • [9] Approximate Relative Value Learning for Average-reward Continuous State MDPs
    Sharma, Hiteshi
    Jafarnia-Jahromi, Mehdi
    Jain, Rahul
    [J]. 35TH UNCERTAINTY IN ARTIFICIAL INTELLIGENCE CONFERENCE (UAI 2019), 2020, 115 : 956 - 964
  • [10] NECESSARY CONDITIONS FOR THE OPTIMALITY EQUATION IN AVERAGE-REWARD MARKOV DECISION-PROCESSES
    CAVAZOSCADENA, R
    [J]. APPLIED MATHEMATICS AND OPTIMIZATION, 1989, 19 (01) : 97 - 112