50 entries in total
- [21] Learning Routines for Effective Off-Policy Reinforcement Learning INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
- [22] Safe and efficient off-policy reinforcement learning ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
- [23] Conditional Importance Sampling for Off-Policy Learning INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 45 - 54
- [24] Chaining Value Functions for Off-Policy Learning THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022 : 8187 - 8195
- [26] Bounds for Off-policy Prediction in Reinforcement Learning 2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017 : 3991 - 3997
- [27] Off-Policy Imitation Learning from Observations ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
- [28] The Pitfalls of Regularization in Off-Policy TD Learning ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
- [29] Off-Policy Learning-to-Bid with AuctionGym PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023, 2023 : 4219 - 4228