Large Language Models Are Semi-Parametric Reinforcement Learning Agents

Cited by: 0

Authors:

Zhang, Danyang [1]

Chen, Lu [1,2]

Zhang, Situo [1]

Xu, Hongshen [1]

Zhao, Zihan [1]

Yu, Kai [1,2]

Affiliations:

[1] Shanghai Jiao Tong Univ, SJTU AI Inst, X LANCE Lab, Dept Comp Sci & Engn, MoE Key Lab Artificial Intel, Shanghai, Peoples R China

[2] Suzhou Lab, Suzhou, Peoples R China

Source:

ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Inspired by insights from cognitive science into human memory and reasoning mechanisms, we propose REMEMBERER, a novel evolvable agent framework based on a Large Language Model (LLM). By equipping the LLM with a long-term experience memory, REMEMBERER can exploit experiences from past episodes, even those collected for different task goals, which gives it an advantage over LLM-based agents that rely on fixed exemplars or on a transient working memory. We further introduce Reinforcement Learning with Experience Memory (RLEM) to update the memory. The whole system can thus learn from both successful and failed experiences and improve its capability without fine-tuning the parameters of the LLM. In this way, REMEMBERER constitutes a semi-parametric RL agent. Extensive experiments on two RL task sets evaluate the proposed framework. Averaged over different initializations and training sets, the results exceed the prior SOTA success rate by 4% and 2% on the two task sets, respectively, demonstrating the superiority and robustness of REMEMBERER.


Pages: 13

