共 50 条
- [1] Learning with Options that Terminate Off-Policy THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 3173 - 3182
- [2] Online Learning with Off-Policy Feedback INTERNATIONAL CONFERENCE ON ALGORITHMIC LEARNING THEORY, VOL 201, 2023, 201 : 620 - 641
- [3] Off-policy Learning for Multiple Loggers KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019, : 1184 - 1193
- [4] Exponential Smoothing for Off-Policy Learning INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 202, 2023, 202
- [5] Off-Policy Evaluation via Off-Policy Classification ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
- [6] Learning Routines for Effective Off-Policy Reinforcement Learning INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
- [7] Safe and efficient off-policy reinforcement learning ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
- [8] Conditional Importance Sampling for Off-Policy Learning INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 45 - 54
- [9] Chaining Value Functions for Off-Policy Learning THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 8187 - 8195