7 items in total
- [1] Approximate value iteration and temporal-difference learning. IEEE 2000 Adaptive Systems for Signal Processing, Communications, and Control Symposium - Proceedings, 2000: 48-51
- [3] On the Existence of Fixed Points for Approximate Value Iteration and Temporal-Difference Learning. Journal of Optimization Theory and Applications, 2000, 105: 589-608
- [4] Eigensubspace of Temporal-Difference Dynamics and How It Improves Value Approximation in Reinforcement Learning. Machine Learning and Knowledge Discovery in Databases: Research Track, ECML PKDD 2023, Part IV, 2023, 14172: 573-589
- [5] Intentionally-underestimated value function at terminal state for temporal-difference learning with mis-designed reward. Results in Control and Optimization, 2025, 18
- [6] A temporal-difference learning method using Gaussian state representation for continuous state space problems. Japanese Society for Artificial Intelligence, (29)
- [7] Adaptive Temporal-Difference Learning for Policy Evaluation with Per-State Uncertainty Estimates. Advances in Neural Information Processing Systems 32 (NIPS 2019), 2019, 32