50 entries in total
- [32] Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective [J]. Advances in Neural Information Processing Systems 36 (NeurIPS 2023), 2023
- [34] Distributed Off-Policy Actor-Critic Reinforcement Learning with Policy Consensus [J]. 2019 IEEE 58th Conference on Decision and Control (CDC), 2019: 4674-4679
- [35] Exploring the Use of Invalid Action Masking in Reinforcement Learning: A Comparative Study of On-Policy and Off-Policy Algorithms in Real-Time Strategy Games [J]. Applied Sciences-Basel, 2023, 13 (14)
- [36] Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift [J]. Thirty-Third AAAI Conference on Artificial Intelligence / Thirty-First Innovative Applications of Artificial Intelligence Conference / Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, 2019: 3647-3655
- [37] Safe Off-Policy Reinforcement Learning Using Barrier Functions [J]. 2020 American Control Conference (ACC), 2020: 2176-2181
- [38] Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning [J]. International Conference on Artificial Intelligence and Statistics, Vol 108, 2020, 108
- [39] Regret Minimization Experience Replay in Off-Policy Reinforcement Learning [J]. Advances in Neural Information Processing Systems 34 (NeurIPS 2021), 2021, 34
- [40] Off-policy evaluation for tabular reinforcement learning with synthetic trajectories [J]. Statistics and Computing, 2024, 34