AUTOMATIC PROGRAMMING OF BEHAVIOR-BASED ROBOTS USING REINFORCEMENT LEARNING

Cited by: 211
Authors
MAHADEVAN, S
CONNELL, J
Affiliation
[1] IBM T.J. Watson Research Center, Yorktown Heights, NY 10598
DOI
10.1016/0004-3702(92)90058-6
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper describes a general approach for automatically programming a behavior-based robot. New behaviors are learned by trial and error using a performance feedback function as reinforcement. Two algorithms for behavior learning are described that combine Q learning, a well-known scheme for propagating reinforcement values temporally across actions, with statistical clustering and Hamming distance, two ways of propagating reinforcement values spatially across states. A real behavior-based robot called OBELIX is described that learns several component behaviors in an example task involving pushing boxes. A simulator for the box pushing task is also used to gather data on the learning techniques. A detailed experimental study using the real robot and the simulator suggests two conclusions. (1) The learning techniques are able to learn the individual behaviors, sometimes outperforming a handcoded program. (2) Using a behavior-based architecture speeds up reinforcement learning by converting the problem of learning a complex task into that of learning a simpler set of special-purpose reactive subtasks.
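The idea of propagating reinforcement temporally (Q learning) and spatially (Hamming distance) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `HammingQLearner`, the encoding of states as tuples of bits, and the parameters `alpha`, `gamma`, and `radius` are all assumptions chosen for illustration, and the abstract does not specify how the paper weights or thresholds the distance.

```python
from collections import defaultdict


def hamming(a, b):
    """Count the positions where two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


class HammingQLearner:
    """Tabular Q learning with a hypothetical Hamming-radius generalization.

    Assumption: states are tuples of 0/1 sensor bits, and an update for one
    state is also applied to every previously seen state within `radius`
    bit flips of it.
    """

    def __init__(self, actions, alpha=0.5, gamma=0.9, radius=1):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate (illustrative value)
        self.gamma = gamma      # discount factor (illustrative value)
        self.radius = radius    # maximum Hamming distance for sharing updates
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.seen = set()            # states encountered so far

    def best_value(self, state):
        """Largest Q value available from `state`."""
        return max(self.q[(state, a)] for a in self.actions)

    def greedy_action(self, state):
        """Action with the largest Q value in `state`."""
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Apply one Q learning step and share it with nearby seen states."""
        self.seen.add(state)
        self.seen.add(next_state)
        # Temporal propagation: the standard one-step Q learning target.
        target = reward + self.gamma * self.best_value(next_state)
        # Spatial propagation: move every seen state within the radius
        # toward the same target for the same action.
        for s in self.seen:
            if hamming(s, state) <= self.radius:
                key = (s, action)
                self.q[key] += self.alpha * (target - self.q[key])
```

With `radius=1`, a reward observed in state `(0, 0, 1)` also raises the value of the same action in a previously seen state `(0, 1, 1)`, which differs in one bit, while a state three bits away is left unchanged.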
Pages: 311-365
Page count: 55
Related papers
50 in total
  • [1] Measuring the effectiveness of reinforcement learning for behavior-based robots
    Shackleton, J
    Gini, M
    [J]. ADAPTIVE BEHAVIOR, 1997, 5 (3-4) : 365 - 390
  • [2] An architecture for behavior-based reinforcement learning
    Konidaris, GD
    Hayes, GM
    [J]. ADAPTIVE BEHAVIOR, 2005, 13 (01) : 5 - 32
  • [3] Learning to coordinate behaviors in soft behavior-based systems using reinforcement learning
    Azar, Mohammad G.
    Ahmadabadi, Majid Nili
    Farahmand, Amir Massoud
    Araabi, Babak Nadjar
    [J]. 2006 IEEE INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORK PROCEEDINGS, VOLS 1-10, 2006, : 241 - +
  • [4] A behavior-based scheme using reinforcement learning for autonomous underwater vehicles
    Carreras, M
    Yuh, J
    Batlle, J
    Ridao, P
    [J]. IEEE JOURNAL OF OCEANIC ENGINEERING, 2005, 30 (02) : 416 - 427
  • [5] A Behavior-Based Reinforcement Learning Approach to Control Walking Bipedal Robots Under Unknown Disturbances
    Beranek, Richard
    Karimi, Masoud
    Ahmadi, Mojtaba
    [J]. IEEE-ASME TRANSACTIONS ON MECHATRONICS, 2022, 27 (05) : 2710 - 2720
  • [6] Learning to ground fact symbols in behavior-based robots
    Hertzberg, J
    Jaeger, H
    Schönherr, F
    [J]. ECAI 2002: 15TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2002, 77 : 708 - 712
  • [7] Behavior-based learning fuzzy rules for mobile robots
    Thongchai, S
    [J]. PROCEEDINGS OF THE 2002 AMERICAN CONTROL CONFERENCE, VOLS 1-6, 2002, 1-6 : 995 - 1000
  • [8] Music recommender using deep embedding-based features and behavior-based reinforcement learning
    Chang, Jia-Wei
    Chiou, Ching-Yi
    Liao, Jia-Yi
    Hung, Ying-Kai
    Huang, Chien-Che
    Lin, Kuan-Cheng
    Pu, Ying-Hung
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (26-27) : 34037 - 34064
  • [9] Behavior-based Reinforcement Learning Control for Robotic Rehabilitation Training
    Meng, Fancheng
    Fan, Keyan
    [J]. 2015 27TH CHINESE CONTROL AND DECISION CONFERENCE (CCDC), 2015, : 4330 - 4334