14 items in total
- [1] Infinite-horizon policy-gradient estimation [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2001, 15 : 319 - 350
- [2] Experiments with infinite-horizon, policy-gradient estimation [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2001, 15 : 351 - 381
- [3] KONDA VR, 1999, NEURAL INFORM PROCES
- [5] RIEDMILLER M, 2000, J NEURAL COMPUTING A, V8, P323
- [6] Reinforcement learning for RoboCup soccer keepaway [J]. ADAPTIVE BEHAVIOR, 2005, 13 (03) : 165 - 188
- [7] STONE P, 2001, ROBOCUP 2000 ROB SOC, V201, P249
- [8] STONE P, 2001, P 5 INT C AUT AG NY
- [9] STONE P, 2001, P 18 INT C MACH LEAR
- [10] Sutton R. S., 1998, Reinforcement Learning: An Introduction, V22447