50 entries in total
- [23] On the Convergence of Temporal-Difference Learning with Linear Function Approximation. Machine Learning, 2001, 42: 241-267
- [25] Policy Gradient With Value Function Approximation For Collective Multiagent Planning. Advances in Neural Information Processing Systems 30 (NIPS 2017), 2017, 30
- [26] A policy gradient reinforcement learning algorithm with fuzzy function approximation. IEEE ROBIO 2004: Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2004: 936-940
- [28] Improving Gaussian Process Value Function Approximation in Policy Gradient Algorithms. Artificial Neural Networks and Machine Learning - ICANN 2011, Part II, 2011, 6792: 221-+
- [29] Least squares policy evaluation algorithms with linear function approximation. Discrete Event Dynamic Systems: Theory and Applications, 2003, 13 (1-2): 79-110
- [30] Least Squares Policy Evaluation Algorithms with Linear Function Approximation. Discrete Event Dynamic Systems, 2003, 13: 79-110