50 records in total
- [42] On the effectiveness of reward-based policies: Are we using the proper concept of tax reward? ECONOMICS AND BUSINESS LETTERS, 2022, 11(01): 41-45
- [44] Reward learning from human preferences and demonstrations in Atari. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
- [45] Public Employees' Performance-based Reward Preferences. MALIYE DERGISI, 2015, (168): 249-272
- [46] Learning Optimal Advantage from Preferences and Mistaking It for Reward. THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 9, 2024: 10066-10073
- [47] PROBABILITY-REWARD PREFERENCES OF RHESUS-MONKEYS. AMERICAN JOURNAL OF PSYCHOLOGY, 1985, 98(01): 77-84
- [49] Learning Reward Functions by Integrating Human Demonstrations and Preferences. ROBOTICS: SCIENCE AND SYSTEMS XV, 2019
- [50] Multi-Objective POMDPs with Lexicographic Reward Preferences. PROCEEDINGS OF THE TWENTY-FOURTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE (IJCAI), 2015: 1719-1725