Value Driven Representation for Human-in-the-Loop Reinforcement Learning

Cited by: 1
Authors
Keramati, Ramtin [1 ]
Brunskill, Emma [2 ]
Affiliations
[1] Inst Computat & Math Engn, Stanford, CA 94305 USA
[2] Dept Comp Sci, Stanford, CA USA
Keywords
Reinforcement Learning; Human-in-the-Loop;
DOI
10.1145/3320435.3320471
CLC Number
TP3 [Computing and computer technology];
Discipline Code
0812;
Abstract
Interactive adaptive systems powered by Reinforcement Learning (RL) have many potential applications, such as intelligent tutoring systems. In such systems there is typically an external human system designer who creates, monitors, and modifies the interactive adaptive system, trying to improve its performance on the target outcomes. In this paper we focus on the algorithmic foundations of helping the system designer choose the set of sensors or features that define the observation space used by the reinforcement learning agent. We present an algorithm, value driven representation (VDR), that can iteratively and adaptively augment the observation space of a reinforcement learning agent so that it is sufficient to capture a (near-)optimal policy. To do so, we introduce a new method for optimistically estimating the value of a policy using offline simulated Monte Carlo rollouts. We evaluate the performance of our approach on standard RL benchmarks with simulated humans and demonstrate significant improvement over prior baselines.
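The optimistic rollout estimate mentioned in the abstract can be illustrated with a minimal sketch. This is a toy rendering of the general idea, not the paper's implementation: the function name, the tabular sample model, and the rule of granting unseen state-action pairs the maximum achievable return (`V_MAX`) are all assumptions made for illustration. The effect is that the Monte Carlo estimate upper-bounds the policy's true value, which is the optimism property the abstract refers to.

```python
import random

GAMMA = 0.9
R_MAX = 1.0
V_MAX = R_MAX / (1.0 - GAMMA)  # upper bound on any discounted return

def optimistic_rollout_value(policy, model, start_state, horizon, n_rollouts, seed=0):
    """Average discounted return of `policy` under an empirical sample model.

    `model` maps (state, action) -> list of observed (next_state, reward)
    samples from logged data. A state-action pair with no samples is
    treated optimistically: the rollout is credited V_MAX and terminated,
    so the returned estimate never undershoots the true value.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rollouts):
        s, ret, discount = start_state, 0.0, 1.0
        for _ in range(horizon):
            a = policy(s)
            samples = model.get((s, a))
            if not samples:                  # unseen (s, a): optimistic bonus
                ret += discount * V_MAX
                break
            s, r = rng.choice(samples)       # sample an observed transition
            ret += discount * r
            discount *= GAMMA
        total += ret
    return total / n_rollouts
```

A policy whose rollouts stay inside well-observed data is scored by its empirical returns, while a policy that leaves the data receives the optimistic bound, encouraging exploration of representations that might support higher-value behavior.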
Pages: 176-180
Page count: 5
Related Papers
50 records in total
  • [1] Human-in-the-loop Reinforcement Learning
    Liang, Huanghuang
    Yang, Lu
    Cheng, Hong
    Tu, Wenzhe
    Xu, Mengjie
    [J]. 2017 CHINESE AUTOMATION CONGRESS (CAC), 2017, : 4511 - 4518
  • [2] Where to Add Actions in Human-in-the-Loop Reinforcement Learning
    Mandel, Travis
    Liu, Yun-En
    Brunskill, Emma
    Popovic, Zoran
    [J]. THIRTY-FIRST AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2017, : 2322 - 2328
  • [3] Reinforcement Learning Requires Human-in-the-Loop Framing and Approaches
    Taylor, Matthew E.
    [J]. HHAI 2023: AUGMENTING HUMAN INTELLECT, 2023, 368 : 351 - 360
  • [4] ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning
    Chen, Sean
    Gao, Jensen
    Reddy, Siddharth
    Berseth, Glen
    Dragan, Anca D.
    Levine, Sergey
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2022, 2022, : 7505 - 7512
  • [5] HEX: Human-in-the-loop explainability via deep reinforcement learning
    Lash, Michael T.
[J]. DECISION SUPPORT SYSTEMS, 2024, 187
  • [6] Human-in-the-Loop Reinforcement Learning in Continuous-Action Space
    Luo, Biao
    Wu, Zhengke
    Zhou, Fei
    Wang, Bing-Chuan
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 35 (11) : 1 - 10
  • [7] Shared Autonomy Based on Human-in-the-loop Reinforcement Learning with Policy Constraints
    Li, Ming
    Kang, Yu
    Zhao, Yun-Bo
    Zhu, Jin
    You, Shiyi
    [J]. 2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 7349 - 7354
  • [8] Personalization of Hearing Aid Compression by Human-in-the-Loop Deep Reinforcement Learning
    Alamdari, Nasim
    Lobarinas, Edward
    Kehtarnavaz, Nasser
    [J]. IEEE ACCESS, 2020, 8 : 203503 - 203515
  • [9] Thermal comfort management leveraging deep reinforcement learning and human-in-the-loop
    Cicirelli, Franco
    Guerrieri, Antonio
    Mastroianni, Carlo
    Spezzano, Giandomenico
    Vinci, Andrea
    [J]. PROCEEDINGS OF THE 2020 IEEE INTERNATIONAL CONFERENCE ON HUMAN-MACHINE SYSTEMS (ICHMS), 2020, : 160 - 165
  • [10] Human-in-the-Loop Reinforcement Learning: A Survey and Position on Requirements, Challenges, and Opportunities
    Retzlaff, Carl Orge
    Das, Srijita
    Wayllace, Christabel
    Mousavi, Payam
    Afshari, Mohammad
    Yang, Tianpei
    Saranti, Anna
    Angerschmid, Alessa
    Taylor, Matthew E.
    Holzinger, Andreas
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2024, 79 : 359 - 415