50 items in total
- [31] Off-Policy Evaluation in Partially Observable Environments. Thirty-Fourth AAAI Conference on Artificial Intelligence, the Thirty-Second Innovative Applications of Artificial Intelligence Conference and the Tenth AAAI Symposium on Educational Advances in Artificial Intelligence, 2020, 34: 10276-10283
- [32] On the Design of Estimators for Bandit Off-Policy Evaluation. International Conference on Machine Learning, Vol 97, 2019, 97
- [33] Off-Policy Interval Estimation with Lipschitz Value Iteration. Advances in Neural Information Processing Systems 33, NeurIPS 2020, 2020, 33
- [34] Data Poisoning Attacks on Off-Policy Policy Evaluation Methods. Uncertainty in Artificial Intelligence, Vol 180, 2022, 180: 1264-1274
- [35] Policy-Adaptive Estimator Selection for Off-Policy Evaluation. Thirty-Seventh AAAI Conference on Artificial Intelligence, Vol 37 No 8, 2023: 10025-10033
- [36] Identification of Subgroups With Similar Benefits in Off-Policy Policy Evaluation. Conference on Health, Inference, and Learning, Vol 174, 2022, 174: 397-410
- [37] Value targets in off-policy AlphaZero: a new greedy backup. Neural Computing & Applications, 2022, 34 (03): 1801-1814
- [38] Optimal and Adaptive Off-policy Evaluation in Contextual Bandits. International Conference on Machine Learning, Vol 70, 2017, 70
- [39] Value targets in off-policy AlphaZero: a new greedy backup. Neural Computing and Applications, 2022, 34: 1801-1814
- [40] Conformal Off-Policy Evaluation in Markov Decision Processes. 2023 62nd IEEE Conference on Decision and Control, CDC, 2023: 3087-3094