A Survey of Robot Intelligence with Large Language Models

Cited by: 3
Authors
Jeong, Hyeongyo [1]
Lee, Haechan [1]
Kim, Changwon [2]
Shin, Sungtae [1]
Affiliations
[1] Dong A Univ, Dept Mech Engn, Busan 49315, South Korea
[2] Pukyong Natl Univ, Sch Mech Engn, Busan 48513, South Korea
Source
APPLIED SCIENCES-BASEL | 2024, Vol. 14, No. 19
Funding
National Research Foundation of Singapore;
Keywords
embodied intelligence; foundation model; large language model (LLM); vision-language model (VLM); vision-language-action (VLA) model; robotics;
DOI
10.3390/app14198868
Chinese Library Classification
O6 [Chemistry];
Discipline Classification Code
0703;
Abstract
Since the emergence of ChatGPT, research on large language models (LLMs) has progressed actively across various fields. LLMs, pre-trained on vast text datasets, have exhibited exceptional abilities in understanding natural language and planning tasks, and these abilities are promising for robotics. Traditional supervised learning-based robot intelligence systems generally lack adaptability to dynamically changing environments, whereas LLMs help a robot intelligence system improve its generalization ability in dynamic and complex real-world environments. Indeed, findings from ongoing robotics studies indicate that LLMs can significantly improve robots' behavior planning and execution capabilities. Additionally, vision-language models (VLMs), trained on extensive visual and linguistic data for the visual question answering (VQA) problem, excel at integrating computer vision with natural language processing. VLMs can comprehend visual contexts, execute actions through natural language, and provide natural language descriptions of scenes. Several studies have explored enhancing robot intelligence with multimodal data, including object recognition and description by VLMs and the execution of language-driven commands integrated with visual information. This review paper thoroughly investigates how foundation models such as LLMs and VLMs have been employed to boost robot intelligence. For clarity, the research areas are categorized into five topics: reward design in reinforcement learning, low-level control, high-level planning, manipulation, and scene understanding. The review also summarizes studies showing how specific foundation models have improved robot intelligence, such as Eureka, which automates reward function design in reinforcement learning; RT-2, a vision-language-action model that integrates visual data, language, and robot actions; and AutoRT, which generates feasible tasks and executes robot behavior policies via LLMs.
Pages: 39
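The abstract's first topic, reward design in reinforcement learning, refers to the pattern popularized by Eureka: an LLM writes candidate reward functions as code, each candidate is scored by training and evaluating a policy, and the score is fed back into the next prompt. The Python sketch below illustrates that loop only in outline; it is not the Eureka implementation, the function names (`query_llm`, `compile_reward`, `evaluate`, `reward_design_loop`) are hypothetical, and the LLM call and RL evaluation are replaced with toy stand-ins so the example runs offline.

```python
"""Minimal sketch of an LLM-driven reward-design loop (an illustration of the
general pattern described in the abstract, not the actual Eureka code)."""

import random
from typing import Callable, Tuple


def query_llm(prompt: str) -> str:
    """Hypothetical LLM call: a real system would send `prompt` to an LLM API.
    Here we return canned candidate reward functions so the sketch runs offline."""
    candidates = [
        # Candidate 1: distance-only shaping.
        "def reward(distance_to_goal, velocity):\n"
        "    return -distance_to_goal\n",
        # Candidate 2: distance shaping plus a small velocity penalty.
        "def reward(distance_to_goal, velocity):\n"
        "    return -distance_to_goal - 0.1 * abs(velocity)\n",
    ]
    return random.choice(candidates)


def compile_reward(source: str) -> Callable[[float, float], float]:
    """Turn the LLM-generated source string into a callable reward function."""
    namespace: dict = {}
    # Assumption: trusted input. A real system must sandbox LLM-generated code.
    exec(source, namespace)
    return namespace["reward"]


def evaluate(reward_fn: Callable[[float, float], float], episodes: int = 20) -> float:
    """Toy stand-in for RL training + evaluation: sample random states and
    report the mean reward as a fitness score for the candidate."""
    total = 0.0
    for _ in range(episodes):
        distance = random.uniform(0.0, 2.0)
        velocity = random.uniform(-1.0, 1.0)
        total += reward_fn(distance, velocity)
    return total / episodes


def reward_design_loop(task_description: str, iterations: int = 3) -> Tuple[str, float]:
    """Iteratively ask the LLM for reward code and keep the best-scoring candidate."""
    prompt = f"Write a Python reward function for: {task_description}"
    best_source, best_score = "", float("-inf")
    for i in range(iterations):
        source = query_llm(prompt)
        score = evaluate(compile_reward(source))
        if score > best_score:
            best_source, best_score = source, score
        # Fold the evaluation feedback back into the next prompt.
        prompt += f"\n# Iteration {i}: candidate scored {score:.3f}; please improve it."
    return best_source, best_score


if __name__ == "__main__":
    src, score = reward_design_loop("move the end-effector to the goal position")
    print(f"Best candidate (score {score:.3f}):\n{src}")
```

In a real pipeline, `query_llm` would call an actual LLM API with the environment source code in the prompt, `evaluate` would run full policy training in the target simulator, and the generated code would be executed in a sandbox rather than directly via `exec`.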