50 entries in total
- [21] The Policy Iteration Algorithm for Average Continuous Control of Piecewise Deterministic Markov Processes. APPLIED MATHEMATICS AND OPTIMIZATION, 2010, 62(2): 185-204
- [22] The Policy Iteration Algorithm for Average Continuous Control of Piecewise Deterministic Markov Processes. PROCEEDINGS OF THE 48TH IEEE CONFERENCE ON DECISION AND CONTROL, 2009 HELD JOINTLY WITH THE 2009 28TH CHINESE CONTROL CONFERENCE (CDC/CCC 2009), 2009: 506-511
- [23] The Policy Iteration Algorithm for Average Continuous Control of Piecewise Deterministic Markov Processes. Applied Mathematics & Optimization, 2010, 62: 185-204
- [26] Advantage Based Value Iteration for Markov Decision Processes with Unknown Rewards. 2016 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2016: 3837-3844
- [27] Uniform convergence of value iteration policies for discounted Markov decision processes. BOLETIN DE LA SOCIEDAD MATEMATICA MEXICANA, 2006, 12(1): 133-148