Large Language Models Are Semi-Parametric Reinforcement Learning Agents

Authors
Zhang, Danyang [1]
Chen, Lu [1, 2]
Zhang, Situo [1]
Xu, Hongshen [1]
Zhao, Zihan [1]
Yu, Kai [1, 2]
Affiliations
[1] Shanghai Jiao Tong Univ, SJTU AI Inst, X-LANCE Lab, Dept Comp Sci & Engn, MoE Key Lab Artificial Intel, Shanghai, Peoples R China
[2] Suzhou Lab, Suzhou, Peoples R China
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Inspired by insights from cognitive science into human memory and reasoning mechanisms, a novel evolvable LLM-based (Large Language Model) agent framework, REMEMBERER, is proposed. By equipping the LLM with a long-term experience memory, REMEMBERER can exploit experiences from past episodes even when the task goals differ, which gives it an advantage over LLM-based agents that rely on fixed exemplars or on a transient working memory. We further introduce Reinforcement Learning with Experience Memory (RLEM) to update the memory. The whole system can thus learn from both successful and failed experiences and improve its capability without fine-tuning the parameters of the LLM. In this way, REMEMBERER constitutes a semi-parametric RL agent. Extensive experiments are conducted on two RL task sets to evaluate the proposed framework. Averaged over different initializations and training sets, the success rates exceed the prior SOTA by 4% and 2% on the two task sets, respectively, demonstrating the superiority and robustness of REMEMBERER.
Pages: 13