A teaching strategy for memory-based control

Cited by: 12

Authors:

Sheppard, JW

Salzberg, SL

Affiliation:

[1] The Johns Hopkins University, Department of Computer Science

Source:

ARTIFICIAL INTELLIGENCE REVIEW | 1997 / Vol. 11 / Issues 1-5

Keywords:

lazy learning; nearest neighbor; genetic algorithms; differential games; pursuit games; teaching; reinforcement learning;

DOI:

10.1023/A:1006597715165

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Combining different machine learning algorithms in the same system can produce benefits above and beyond what either method could achieve alone. This paper demonstrates that genetic algorithms can be used in conjunction with lazy learning to solve examples of a difficult class of delayed reinforcement learning problems better than either method alone. This class, the class of differential games, includes numerous important control problems that arise in robotics, planning, game playing, and other areas, and solutions for differential games suggest solution strategies for the general class of planning and control problems. We conducted a series of experiments applying three learning approaches - lazy Q-learning, k-nearest neighbor (k-NN), and a genetic algorithm - to a particular differential game called a pursuit game. Our experiments demonstrate that k-NN had great difficulty solving the problem, while a lazy version of Q-learning performed moderately well and the genetic algorithm performed even better. These results motivated the next step in the experiments, where we hypothesized k-NN was having difficulty because it did not have good examples - a common source of difficulty for lazy learning. Therefore, we used the genetic algorithm as a bootstrapping method for k-NN to create a system to provide these examples. Our experiments demonstrate that the resulting joint system learned to solve the pursuit games with a high degree of accuracy, outperforming either method alone, and with relatively small memory requirements.


Pages: 343-370

Page count: 28
