50 items in total
- [1] Nearly Minimax Optimal Reinforcement Learning for Discounted MDPs. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021, 34
- [2] Minimax Regret Bounds for Reinforcement Learning. International Conference on Machine Learning, Vol. 70, 2017, 70
- [7] Reinforcement Learning in Factored MDPs: Oracle-Efficient Algorithms and Tighter Regret Bounds for the Non-Episodic Setting. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020, 33
- [10] Reinforcement Learning for MDPs with Constraints. Machine Learning: ECML 2006, Proceedings, 2006, 4212: 646-653