A Survey of Robot Intelligence with Large Language Models

Cited by: 3
Authors
Jeong, Hyeongyo [1]
Lee, Haechan [1]
Kim, Changwon [2]
Shin, Sungtae [1]
Affiliations
[1] Dong A Univ, Dept Mech Engn, Busan 49315, South Korea
[2] Pukyong Natl Univ, Sch Mech Engn, Busan 48513, South Korea
Source
APPLIED SCIENCES-BASEL | 2024, Vol. 14, No. 19
Funding
National Research Foundation of Singapore;
Keywords
embodied intelligence; foundation model; large language model (LLM); vision-language model (VLM); vision-language-action (VLA) model; robotics;
DOI
10.3390/app14198868
Chinese Library Classification
O6 [Chemistry];
Discipline Classification Code
0703;
Abstract
Since the emergence of ChatGPT, research on large language models (LLMs) has progressed actively across various fields. LLMs, pre-trained on vast text datasets, have exhibited exceptional abilities in understanding natural language and planning tasks, and these abilities are promising for robotics. Traditional supervised learning-based robot intelligence systems generally lack adaptability to dynamically changing environments, whereas LLMs help a robot intelligence system improve its generalization ability in dynamic and complex real-world environments. Indeed, findings from ongoing robotics studies indicate that LLMs can significantly improve robots' behavior planning and execution capabilities. Additionally, vision-language models (VLMs), trained on extensive visual and linguistic data for the visual question answering (VQA) problem, excel at integrating computer vision with natural language processing. VLMs can comprehend visual contexts, execute actions through natural language, and provide natural language descriptions of scenes. Several studies have explored enhancing robot intelligence with multimodal data, including object recognition and description by VLMs and the execution of language-driven commands integrated with visual information. This review paper thoroughly investigates how foundation models such as LLMs and VLMs have been employed to boost robot intelligence. For clarity, the research areas are categorized into five topics: reward design in reinforcement learning, low-level control, high-level planning, manipulation, and scene understanding. The review also summarizes studies showing how specific foundation models have improved robot intelligence, such as Eureka, which automates reward function design in reinforcement learning; RT-2, a vision-language-action model that integrates visual data, language, and robot actions; and AutoRT, which generates feasible tasks and executes robot behavior policies via LLMs.
Pages: 39
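The abstract's first topic, reward design in reinforcement learning, refers to the pattern popularized by Eureka: an LLM writes candidate reward functions as code, each candidate is scored by training and evaluating a policy, and the score is fed back into the next prompt. The Python sketch below illustrates that loop only in outline; it is not the Eureka implementation, the function names (`query_llm`, `compile_reward`, `evaluate`, `reward_design_loop`) are hypothetical, and the LLM call and RL evaluation are replaced with toy stand-ins so the example runs offline.

```python
"""Minimal sketch of an LLM-driven reward-design loop (an illustration of the
general pattern described in the abstract, not the actual Eureka code)."""

import random
from typing import Callable, Tuple


def query_llm(prompt: str) -> str:
    """Hypothetical LLM call: a real system would send `prompt` to an LLM API.
    Here we return canned candidate reward functions so the sketch runs offline."""
    candidates = [
        # Candidate 1: distance-only shaping.
        "def reward(distance_to_goal, velocity):\n"
        "    return -distance_to_goal\n",
        # Candidate 2: distance shaping plus a small velocity penalty.
        "def reward(distance_to_goal, velocity):\n"
        "    return -distance_to_goal - 0.1 * abs(velocity)\n",
    ]
    return random.choice(candidates)


def compile_reward(source: str) -> Callable[[float, float], float]:
    """Turn the LLM-generated source string into a callable reward function."""
    namespace: dict = {}
    # Assumption: trusted input. A real system must sandbox LLM-generated code.
    exec(source, namespace)
    return namespace["reward"]


def evaluate(reward_fn: Callable[[float, float], float], episodes: int = 20) -> float:
    """Toy stand-in for RL training + evaluation: sample random states and
    report the mean reward as a fitness score for the candidate."""
    total = 0.0
    for _ in range(episodes):
        distance = random.uniform(0.0, 2.0)
        velocity = random.uniform(-1.0, 1.0)
        total += reward_fn(distance, velocity)
    return total / episodes


def reward_design_loop(task_description: str, iterations: int = 3) -> Tuple[str, float]:
    """Iteratively ask the LLM for reward code and keep the best-scoring candidate."""
    prompt = f"Write a Python reward function for: {task_description}"
    best_source, best_score = "", float("-inf")
    for i in range(iterations):
        source = query_llm(prompt)
        score = evaluate(compile_reward(source))
        if score > best_score:
            best_source, best_score = source, score
        # Fold the evaluation feedback back into the next prompt.
        prompt += f"\n# Iteration {i}: candidate scored {score:.3f}; please improve it."
    return best_source, best_score


if __name__ == "__main__":
    src, score = reward_design_loop("move the end-effector to the goal position")
    print(f"Best candidate (score {score:.3f}):\n{src}")
```

In a real pipeline, `query_llm` would call an actual LLM API with the environment source code in the prompt, `evaluate` would run full policy training in the target simulator, and the generated code would be executed in a sandbox rather than directly via `exec`.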