50 entries in total
- [41] MARKOV DECISION-PROCESSES WITH BOTH CONTINUOUS AND IMPULSIVE CONTROL LECTURE NOTES IN CONTROL AND INFORMATION SCIENCES, 1986, 81 : 234 - 246
- [42] A Duality Approach for Regret Minimization in Average-Reward Ergodic Markov Decision Processes LEARNING FOR DYNAMICS AND CONTROL, VOL 120, 2020, 120 : 862 - 883
- [43] BATCH POLICY LEARNING IN AVERAGE REWARD MARKOV DECISION PROCESSES ANNALS OF STATISTICS, 2022, 50 (06): 3364 - 3387
- [44] Learning and Planning in Average-Reward Markov Decision Processes INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139 : 7665 - 7676
- [45] Bounded parameter Markov decision processes with average reward criterion LEARNING THEORY, PROCEEDINGS, 2007, 4539 : 263 - +
- [46] Pseudometrics for state aggregation in average reward Markov decision processes ALGORITHMIC LEARNING THEORY, PROCEEDINGS, 2007, 4754 : 373 - 387
- [48] APPROXIMATING THE MARKOV PROPERTY IN MARKOV DECISION-PROCESSES INFORMATION AND DECISION TECHNOLOGIES, 1989, 15 (03): 147 - 162