50 entries in total
- [21] Flexible Data Augmentation in Off-Policy Reinforcement Learning [J]. ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING (ICAISC 2021), PT I, 2021, 12854 : 224 - 235
- [22] Off-Policy Deep Reinforcement Learning without Exploration [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
- [24] Research on Off-Policy Evaluation in Reinforcement Learning: A Survey [J]. Jisuanji Xuebao/Chinese Journal of Computers, 2022, 45 (09) : 1926 - 1945
- [26] Unifying Gradient Estimators for Meta-Reinforcement Learning via Off-Policy Evaluation [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021
- [27] Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023
- [28] Reinforcement Learning for Spoken Dialogue Systems Using Off-Policy Natural Gradient Method [J]. 2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012 : 7 - 12
- [29] Distributed off-Policy Actor-Critic Reinforcement Learning with Policy Consensus [J]. 2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL (CDC), 2019 : 4674 - 4679
- [30] Off-Policy Deep Reinforcement Learning by Bootstrapping the Covariate Shift [J]. THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019 : 3647 - 3655