Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning

Authors
Carta, Thomas [1]
Romac, Clement [1,2]
Wolf, Thomas [2]
Lamprier, Sylvain [3]
Sigaud, Olivier [4]
Oudeyer, Pierre-Yves [1]
Affiliations
[1] Univ Bordeaux, Inria Flowers, Bordeaux, France
[2] Hugging Face, Paris, France
[3] Univ Angers, LERIA, SFR MATHSTIC, F-49000 Angers, France
[4] Sorbonne Univ, ISIR, Paris, France
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent works have successfully leveraged the ability of Large Language Models (LLMs) to capture abstract knowledge about the world's physics to solve decision-making problems. Yet the alignment between LLMs' knowledge and the environment can be wrong and can limit functional competence due to a lack of grounding. In this paper, we study an approach (named GLAM) to achieve this alignment through functional grounding: we consider an agent that uses an LLM as a policy that is progressively updated as the agent interacts with the environment, leveraging online Reinforcement Learning to improve its performance in solving goals. Using an interactive textual environment designed to study higher-level forms of functional grounding, and a set of spatial and navigation tasks, we study several scientific questions: 1) Can LLMs boost sample efficiency for online learning of various RL tasks? 2) How can they boost different forms of generalization? 3) What is the impact of online learning? We study these questions by functionally grounding several variants (size, architecture) of FLAN-T5.
Pages: 38