50 items in total
- [1] A Nonparametric Off-Policy Policy Gradient [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108
- [2] Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
- [3] Off-policy and on-policy reinforcement learning with the Tsetlin machine [J]. Applied Intelligence, 2023, 53 : 8596 - 8613
- [5] Safe and efficient off-policy reinforcement learning [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
- [6] Bounds for Off-policy Prediction in Reinforcement Learning [J]. 2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017 : 3991 - 3997
- [8] Off-Policy Reinforcement Learning with Delayed Rewards [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
- [10] A perspective on off-policy evaluation in reinforcement learning [J]. Frontiers of Computer Science, 2019, 13 : 911 - 912