Learning Something from Nothing: Leveraging Implicit Human Feedback Strategies

Cited by: 0
Authors
Loftin, Robert [1]
Peng, Bei [2]
MacGlashan, James [3]
Littman, Michael L. [3]
Taylor, Matthew E. [2]
Huang, Jeff [3]
Roberts, David L. [1]
Affiliations
[1] N Carolina State Univ, Dept Comp Sci, Raleigh, NC 27695 USA
[2] Washington State Univ, Sch Elect Engn & Comp Sci, Pullman, WA 99164 USA
[3] Brown Univ, Dept Comp Sci, Providence, RI 02912 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
For robots to be useful in real-world situations, it is critical that non-technical users be able to train them. Existing work has considered the problem of a robot or virtual agent learning behaviors from evaluative feedback provided by a human trainer. That work, however, has treated feedback as a numeric reward that the agent seeks to maximize, and has assumed that all trainers will provide feedback in the same way when teaching the same behavior. We report the results of a series of user studies that indicate human trainers use a variety of approaches to providing feedback in practice, which we describe as different "training strategies." For example, users may not always give explicit feedback in response to an action, and may be more likely to provide explicit reward than explicit punishment, or vice versa. If the trainer is consistent in their strategy, then it may be possible to infer knowledge about the desired behavior from cases where no explicit feedback is provided. We discuss a probabilistic model of human-provided feedback that can be used to classify these different training strategies based on when the trainer chooses to provide explicit reward and/or explicit punishment, and when they choose to provide no feedback. Additionally, we investigate how training strategies may change in response to the appearance of the learning agent. Ultimately, based on this work, we argue that learning agents designed to understand and adapt to different users' training strategies will allow more efficient and intuitive learning experiences.
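The abstract's idea of classifying training strategies by when a trainer gives explicit reward, explicit punishment, or no feedback can be illustrated with a rough sketch. This is not the authors' model: the function names, the strategy labels, the 0.5 threshold, and the assumption that each action is already labeled correct or incorrect are all hypothetical choices made for illustration. The sketch simply estimates how often feedback is withheld after correct and after incorrect actions, then assigns a label from those two rates.

```python
from collections import Counter

FEEDBACK_TYPES = ("reward", "punish", "none")


def withholding_rates(events):
    """Estimate how often the trainer gives no feedback.

    `events` is an iterable of (action_was_correct, feedback) pairs, where
    feedback is one of "reward", "punish", or "none". Returns a pair
    (rate after correct actions, rate after incorrect actions).
    Hypothetical helper; not taken from the paper.
    """
    counts = Counter(events)

    def rate(correct):
        total = sum(counts[(correct, f)] for f in FEEDBACK_TYPES)
        return counts[(correct, "none")] / total if total else 0.0

    return rate(True), rate(False)


def classify_strategy(events, threshold=0.5):
    """Assign an illustrative strategy label from the two withholding rates.

    The labels and the threshold are assumptions of this sketch.
    """
    silent_when_correct, silent_when_wrong = withholding_rates(events)
    explicit_reward = silent_when_correct < threshold
    explicit_punish = silent_when_wrong < threshold
    if explicit_reward and explicit_punish:
        return "balanced"
    if explicit_reward:
        return "reward-focused"
    if explicit_punish:
        return "punishment-focused"
    return "inactive"


# Made-up interaction log: mostly rewards correct actions, mostly silent on mistakes.
events = (
    [(True, "reward")] * 8
    + [(True, "none")] * 2
    + [(False, "none")] * 9
    + [(False, "punish")] * 1
)
print(classify_strategy(events))  # prints "reward-focused"
```

Under a label such as "reward-focused", silence carries information: if the trainer usually rewards correct actions, an action that draws no feedback was more likely incorrect, which is the kind of inference from missing feedback that the abstract describes.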
Pages: 607-612
Page count: 6
Related papers
50 in total
  • [31] A UNIVERSE FROM NOTHING Why there is something rather than nothing
    Polkinghorne, John
    TLS-THE TIMES LITERARY SUPPLEMENT, 2013, (5732): 32 - 32
  • [32] A UNIVERSE FROM NOTHING: Why There Is Something Rather than Nothing
    Murphy, George L.
    PERSPECTIVES ON SCIENCE AND CHRISTIAN FAITH, 2013, 65 (02): 137 - 138
  • [33] A Universe from Nothing: Why There Is Something Rather than Nothing
    Gelernter, Joshua
    COMMENTARY, 2012, 134 (01) : 83 - 84
  • [34] A Framework for Linear TV Recommendation by Leveraging Implicit Feedback
    Agarwal, Abhishek
    Das, Soumita
    Das, Joydeep
    Majumder, Subhashis
    COMPUTATIONAL SCIENCE AND TECHNOLOGY, 2019, 481 : 155 - 164
  • [35] IMPLICIT PAIN-RELATED FEAR: SOMETHING OR NOTHING? THE RELATION BETWEEN IMPLICIT RESPONDING, MOVEMENT AND DISABILITY
    Kruger, Eric S.
    Vowles, Kevin E.
    ANNALS OF BEHAVIORAL MEDICINE, 2020, 54 : S357 - S357
  • [36] A Universe from Nothing: Why There is Something Rather Than Nothing
    Scharf, Caleb
    NATURE, 2012, 481 (7382) : 440 - 440
  • [37] Learning from Interventions: Human-robot interaction as both explicit and implicit feedback
    Spencer, Jonathan
    Choudhury, Sanjiban
    Barnes, Matt
    Schmittle, Matthew
    Chiang, Mung
    Ramadge, Peter
    Srinivasa, Siddhartha
    ROBOTICS: SCIENCE AND SYSTEMS XVI, 2020,
  • [38] The impact of implicit and explicit suggestions that 'there is nothing to learn' on implicit sequence learning
    Vermeylen, Luc
    Abrahamse, Elger
    Braem, Senne
    Rigoni, Davide
    PSYCHOLOGICAL RESEARCH-PSYCHOLOGISCHE FORSCHUNG, 2021, 85 (05): 1943 - 1954
  • [39] The impact of implicit and explicit suggestions that ‘there is nothing to learn’ on implicit sequence learning
    Luc Vermeylen
    Elger Abrahamse
    Senne Braem
    Davide Rigoni
    Psychological Research, 2021, 85 : 1943 - 1954
  • [40] Bandit Learning with Implicit Feedback
    Qi, Yi
    Wu, Qingyun
    Wang, Hongning
    Tang, Jie
    Sun, Maosong
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31