- [31] More Efficient Off-Policy Evaluation through Regularized Targeted Learning. International Conference on Machine Learning, Vol. 97, 2019.
- [32] Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning. International Conference on Artificial Intelligence and Statistics, Vol. 108, 2020.
- [33] Regret Minimization Experience Replay in Off-Policy Reinforcement Learning. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021.
- [34] Off-Policy Evaluation for Tabular Reinforcement Learning with Synthetic Trajectories. Statistics and Computing, Vol. 34, 2024.
- [35] Off-Policy Evaluation via the Regularized Lagrangian. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020.
- [36] Rethinking Population-assisted Off-policy Reinforcement Learning. Proceedings of the 2023 Genetic and Evolutionary Computation Conference (GECCO 2023), 2023, pp. 624-632.
- [38] Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation. International Conference on Machine Learning, Vol. 119, 2020.
- [39] Stabilizing Off-Policy Deep Reinforcement Learning from Pixels. International Conference on Machine Learning, Vol. 162, 2022.