50 entries in total
- [41] Average-Reward Off-Policy Policy Evaluation with Function Approximation INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021
- [42] A Nonparametric Off-Policy Policy Gradient INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020
- [43] Boosted Off-Policy Learning INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 206, 2023
- [44] Supervised Off-Policy Ranking INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022 : 10323 - 10339
- [45] Q(λ) with Off-Policy Corrections ALGORITHMIC LEARNING THEORY (ALT 2016), 2016, 9925 : 305 - 320
- [46] On the Relation between Policy Improvement and Off-Policy Minimum-Variance Policy Evaluation UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 1423 - 1433
- [47] Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020
- [48] Off-Policy Evaluation with Deficient Support Using Side Information ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
- [49] Off-policy evaluation for tabular reinforcement learning with synthetic trajectories Statistics and Computing, 2024, 34
- [50] Bootstrapping Fitted Q-Evaluation for Off-Policy Inference INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021