AUTOMATIC PROGRAMMING OF BEHAVIOR-BASED ROBOTS USING REINFORCEMENT LEARNING

Cited by: 211

Authors:

MAHADEVAN, S

CONNELL, J

Affiliation:

[1] IBM T.J. Watson Research Center, Yorktown Heights, NY 10598

Source:

ARTIFICIAL INTELLIGENCE | 1992 / Vol. 55 / Issue 2-3

DOI:

10.1016/0004-3702(92)90058-6

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper describes a general approach for automatically programming a behavior-based robot. New behaviors are learned by trial and error using a performance feedback function as reinforcement. Two algorithms for behavior learning are described that combine Q learning, a well-known scheme for propagating reinforcement values temporally across actions, with statistical clustering and Hamming distance, two ways of propagating reinforcement values spatially across states. A real behavior-based robot called OBELIX is described that learns several component behaviors in an example task involving pushing boxes. A simulator for the box pushing task is also used to gather data on the learning techniques. A detailed experimental study using the real robot and the simulator suggests two conclusions. (1) The learning techniques are able to learn the individual behaviors, sometimes outperforming a handcoded program. (2) Using a behavior-based architecture speeds up reinforcement learning by converting the problem of learning a complex task into that of learning a simpler set of special-purpose reactive subtasks.
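
The abstract's pairing of Q learning with Hamming distance can be illustrated with a small sketch. This is not the authors' implementation: the class name `HammingQLearner`, the bit-vector state encoding, the `radius` parameter, and all numeric values are assumptions made only for illustration. The idea shown is that a standard one-step Q-learning target is applied not just to the visited state but to every previously seen state within a small Hamming distance of it, so reinforcement spreads across similar states.

```python
from collections import defaultdict


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two states encoded as integers."""
    return bin(a ^ b).count("1")


class HammingQLearner:
    """Tabular Q-learning with Hamming-distance generalization (illustrative).

    All names and default values here are hypothetical, not taken from the paper.
    """

    def __init__(self, actions, alpha=0.5, gamma=0.9, radius=1):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.radius = radius    # max Hamming distance for sharing an update
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.seen = set()            # states encountered so far

    def best_value(self, state):
        """Value of the best action in a state (0.0 for unseen pairs)."""
        return max(self.q[(state, a)] for a in self.actions)

    def update(self, state, action, reward, next_state):
        """Apply one Q-learning step, shared with nearby seen states."""
        # Temporal propagation: the usual one-step Q-learning target.
        target = reward + self.gamma * self.best_value(next_state)
        self.seen.update((state, next_state))
        # Spatial propagation: move every seen state within `radius` bits
        # of the visited state toward the same target.
        for s in self.seen:
            if hamming(s, state) <= self.radius:
                key = (s, action)
                self.q[key] += self.alpha * (target - self.q[key])


if __name__ == "__main__":
    learner = HammingQLearner(actions=["forward", "turn"])
    learner.update(0b0000, "forward", 1.0, 0b0001)
    print(learner.q[(0b0000, "forward")], learner.q[(0b0001, "forward")])
```

In this sketch a larger `radius` shares each update more widely, which can speed learning when nearby bit patterns really do call for the same action and can blur the estimates when they do not.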

Pages: 311-365

Page count: 55

