50 items in total
- [33] Fast and stable learning of quasi-passive dynamic walking by an unstable biped robot based on off-policy natural actor-critic 2006 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-12, 2006 : 5226 - +
- [34] Off-policy Learning in Two-stage Recommender Systems WEB CONFERENCE 2020: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2020), 2020 : 463 - 473
- [35] Actor-critic algorithms ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 12, 2000, 12 : 1008 - 1014
- [36] On actor-critic algorithms SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2003, 42 (04) : 1143 - 1166
- [39] Optimal Actor-Critic Policy With Optimized Training Datasets IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2022, 6 (06) : 1324 - 1334
- [40] Policy-Gradient Based Actor-Critic Algorithms PROCEEDINGS OF THE 2009 WRI GLOBAL CONGRESS ON INTELLIGENT SYSTEMS, VOL III, 2009 : 505 - 509