50 items in total
- [31] Safe Off-policy Reinforcement Learning Using Barrier Functions [J]. 2020 AMERICAN CONTROL CONFERENCE (ACC), 2020: 2176 - 2181
- [32] Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108
- [33] Regret Minimization Experience Replay in Off-Policy Reinforcement Learning [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
- [34] Off-policy evaluation for tabular reinforcement learning with synthetic trajectories [J]. Statistics and Computing, 2024, 34
- [35] Off-Policy Policy Gradient with State Distribution Correction [J]. 35TH UNCERTAINTY IN ARTIFICIAL INTELLIGENCE CONFERENCE (UAI 2019), 2020, 115 : 1180 - 1190
- [36] Rethinking Population-assisted Off-policy Reinforcement Learning [J]. PROCEEDINGS OF THE 2023 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, GECCO 2023, 2023: 624 - 632
- [38] Regularized Anderson Acceleration for Off-Policy Deep Reinforcement Learning [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
- [39] Stabilizing Off-Policy Deep Reinforcement Learning from Pixels [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
- [40] Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119