Large Language Models Are Semi-Parametric Reinforcement Learning Agents

Cited by: 0
Authors
Zhang, Danyang [1]
Chen, Lu [1, 2]
Zhang, Situo [1]
Xu, Hongshen [1]
Zhao, Zihan [1]
Yu, Kai [1, 2]
Affiliations
[1] Shanghai Jiao Tong Univ, SJTU AI Inst, X LANCE Lab, Dept Comp Sci & Engn, MoE Key Lab Artificial Intel, Shanghai, Peoples R China
[2] Suzhou Lab, Suzhou, Peoples R China
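Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract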
Inspired by insights from cognitive science into human memory and reasoning mechanisms, we propose REMEMBERER, a novel evolvable agent framework based on a Large Language Model (LLM). By equipping the LLM with a long-term experience memory, REMEMBERER can exploit experiences from past episodes, even those collected under different task goals, which gives it an advantage over LLM-based agents that rely on fixed exemplars or a transient working memory. We further introduce Reinforcement Learning with Experience Memory (RLEM) to update the memory. The whole system can thus learn from both successful and failed experiences and improve its capability without fine-tuning the parameters of the LLM. In this way, REMEMBERER constitutes a semi-parametric RL agent. Extensive experiments are conducted on two RL task sets to evaluate the proposed framework. Averaged over different initializations and training sets, the success rate exceeds the prior SOTA by 4% and 2% on the two task sets, demonstrating the superiority and robustness of REMEMBERER.
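The abstract does not spell out how RLEM updates the memory, so the following is only a minimal sketch of the general idea: an external table of experiences whose value estimates are adjusted from rewards while the LLM itself stays frozen. It assumes a one-step Q-learning style update and exact-match retrieval; the class `ExperienceMemory`, its methods, and the `lr` and `gamma` parameters are hypothetical names chosen for illustration and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ExperienceMemory:
    """Hypothetical long-term memory: (goal, observation, action) -> value estimate."""
    lr: float = 0.5      # assumed learning rate
    gamma: float = 0.9   # assumed discount factor
    q: dict = field(default_factory=dict)

    def best_value(self, goal, obs):
        """Highest stored value for this goal and observation (0.0 if unseen)."""
        values = [v for (g, o, _), v in self.q.items() if g == goal and o == obs]
        return max(values, default=0.0)

    def update(self, goal, obs, action, reward, next_obs, done):
        """One-step Q-learning style update (an assumption, not the paper's stated rule)."""
        key = (goal, obs, action)
        target = reward
        if not done:
            target += self.gamma * self.best_value(goal, next_obs)
        old = self.q.get(key, 0.0)
        self.q[key] = old + self.lr * (target - old)

    def retrieve(self, goal, obs):
        """Return stored (action, value) pairs, best first.

        High-value and low-value entries could both be placed in the LLM
        prompt as encouraged and discouraged exemplars. Exact matching is a
        simplification made for this sketch.
        """
        hits = [(a, v) for (g, o, a), v in self.q.items() if g == goal and o == obs]
        return sorted(hits, key=lambda pair: pair[1], reverse=True)


memory = ExperienceMemory()
# A terminal step with reward 1.0, then an earlier step that leads into it.
memory.update("g", "s1", "a", reward=1.0, next_obs="s2", done=True)
memory.update("g", "s0", "b", reward=0.0, next_obs="s1", done=False)
print(memory.retrieve("g", "s1"))
```

Only the table changes between episodes, which is what makes such an agent semi-parametric: the parametric part (the LLM) is fixed, and the non-parametric part (the memory) carries what has been learned.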
Pages: 13