50 items in total
- [1] Generalized Off-Policy Actor-Critic [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
- [2] Distributed Actor-Critic Learning Using Emphatic Weightings [J]. 2022 8TH INTERNATIONAL CONFERENCE ON CONTROL, DECISION AND INFORMATION TECHNOLOGIES (CODIT'22), 2022: 1167-1172
- [3] Off-Policy Actor-critic for Recommender Systems [J]. PROCEEDINGS OF THE 16TH ACM CONFERENCE ON RECOMMENDER SYSTEMS, RECSYS 2022, 2022: 338-349
- [5] Noisy Importance Sampling Actor-Critic: An Off-Policy Actor-Critic With Experience Replay [J]. 2020 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2020
- [6] Variance Penalized On-Policy and Off-Policy Actor-Critic [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35: 7899-7907
- [7] Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
- [8] Distributed off-Policy Actor-Critic Reinforcement Learning with Policy Consensus [J]. 2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL (CDC), 2019: 4674-4679
- [9] Online Meta-Critic Learning for Off-Policy Actor-Critic Methods [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 33, NEURIPS 2020, 2020, 33
- [10] An Off-policy Policy Gradient Theorem Using Emphatic Weightings [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31