50 entries in total
- [31] Reinforcement online learning to rank with unbiased reward shaping [J]. Information Retrieval Journal, 2022, 25: 386-413
- [32] Maximizing the average reward in episodic reinforcement learning tasks [J]. 2015 International Conference on Intelligent Informatics and Biomedical Sciences (ICIIBMS), 2015: 420-421
- [33] Structured Reward Shaping using Signal Temporal Logic specifications [J]. 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019: 3481-3486
- [34] Plan-based reward shaping for multi-agent reinforcement learning [J]. Knowledge Engineering Review, 2016, 31(01): 44-58
- [35] Reinforcement Learning With Temporal Logic Rewards [J]. 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017: 3834-3839
- [36] Reinforcement Learning with Temporal Logic Constraints [J]. IFAC-PapersOnLine, 2020, 53(04): 485-492
- [38] Bottom-up multi-agent reinforcement learning by reward shaping for cooperative-competitive tasks [J]. Applied Intelligence, 2021, 51: 4434-4452