共 50 条
- [1] Off-Policy Evaluation via Off-Policy Classification ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
- [2] Minimax Value Interval for Off-Policy Evaluation and Policy Optimization ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
- [3] Off-Policy Evaluation with Policy-Dependent Optimization Response ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
- [4] Doubly Robust Off-policy Value Evaluation for Reinforcement Learning INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
- [5] A Unifying Framework of Off-Policy General Value Function Evaluation ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
- [6] Universal Off-Policy Evaluation ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021,
- [7] Off-Policy Evaluation for Human Feedback ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
- [8] Off-policy evaluation for slate recommendation ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 30 (NIPS 2017), 2017, 30
- [9] High Confidence Off-Policy Evaluation PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015, : 3000 - 3006
- [10] Chaining Value Functions for Off-Policy Learning THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 8187 - 8195