50 entries in total
- [1] Recursive Adaptive-Control of Markov Decision-Processes with the Average Reward Criterion. Applied Mathematics and Optimization, 1991, 23(2): 193-207
- [5] A Unified Approach for Semi-Markov Decision Processes with Discounted and Average Reward Criteria. 2014 11th World Congress on Intelligent Control and Automation (WCICA), 2014: 1741-1744
- [6] Necessary Conditions for the Optimality Equation in Average-Reward Markov Decision-Processes. Applied Mathematics and Optimization, 1989, 19(1): 97-112
- [8] Existence of Optimal Stationary Policies in Average Reward Markov Decision-Processes with a Recurrent State. Applied Mathematics and Optimization, 1992, 26(2): 171-194
- [9] Adaptive Aggregation for Reinforcement Learning in Average Reward Markov Decision Processes. Annals of Operations Research, 2013, 208: 321-336