When Robots Get Chatty: Grounding Multimodal Human-Robot Conversation and Collaboration

Cited by: 0
Authors
Allgeuer, Philipp [1 ]
Ali, Hassan [1 ]
Wermter, Stefan [1 ]
Affiliations
[1] Univ Hamburg, Dept Informat, Knowledge Technol, Hamburg, Germany
Keywords
Natural Dialog for Robots; LLM Grounding; AI-Enabled Robotics; Multimodal Interaction;
DOI
10.1007/978-3-031-72341-4_21
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We investigate the use of Large Language Models (LLMs) to equip neural robotic agents with human-like social and cognitive competencies, for the purpose of open-ended human-robot conversation and collaboration. We introduce a modular and extensible methodology for grounding an LLM in the sensory perceptions and capabilities of a physical robot, and integrate multiple deep learning models throughout the architecture as a form of system integration. The integrated models cover speech recognition, speech generation, open-vocabulary object detection, human pose estimation, and gesture detection, with the LLM serving as the central text-based coordinating unit. Qualitative and quantitative results demonstrate the significant potential of LLMs for providing emergent cognition and interactive, language-oriented control of robots in a natural and social manner. Video: https://youtu.be/A2WLEuiM3-s.
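The abstract describes an architecture in which perception modules feed the central LLM with text, and the LLM's textual output drives the robot. A minimal illustrative sketch of that coordination pattern is shown below; all function names are hypothetical and the LLM is replaced by a stub, since the paper's actual interfaces are not given here.

```python
# Sketch of an LLM-as-coordinator step: each perception module reports
# its observation as text, the (stubbed) LLM consumes the combined
# context, and returns a text action for the robot to execute.

def detect_objects() -> str:
    # Stand-in for an open-vocabulary object detector.
    return "objects: red cup on table"

def detect_gesture() -> str:
    # Stand-in for a pose estimation / gesture detection model.
    return "gesture: human points at the red cup"

def stub_llm(context: str) -> str:
    # Stand-in for the central text-based LLM; a real system would
    # query an actual language model with this context.
    if "points at" in context and "red cup" in context:
        return "pick_up(red cup)"
    return "wait()"

def coordination_step() -> str:
    # Serialize all perceptions into one text context for the LLM.
    context = "\n".join([detect_objects(), detect_gesture()])
    return stub_llm(context)

print(coordination_step())  # -> pick_up(red cup)
```

The key design point conveyed by the abstract is that text is the common currency: every modality is reduced to text before reaching the LLM, which keeps the architecture modular and extensible.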
Pages: 306-321 (16 pages)
Related Papers
50 records
  • [1] Multimodal Interface for Human-Robot Collaboration
    Rautiainen, Samu
    Pantano, Matteo
    Traganos, Konstantinos
    Ahmadi, Seyedamir
    Saenz, Jose
    Mohammed, Wael M.
    Lastra, Jose L. Martinez
    MACHINES, 2022, 10 (10)
  • [2] Human-Robot Collaboration Using Industrial Robots
    Antonelli, Dario
    Bruno, Giulia
    PROCEEDINGS OF THE 2017 2ND INTERNATIONAL CONFERENCE ON ELECTRICAL, AUTOMATION AND MECHANICAL ENGINEERING (EAME 2017), 2017, 86 : 99 - 102
  • [3] Natural multimodal communication for human-robot collaboration
    Maurtua, Inaki
    Fernandez, Izaskun
    Tellaeche, Alberto
    Kildal, Johan
    Susperregi, Loreto
    Ibarguren, Aitor
    Sierra, Basilio
    INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2017, 14 (04): : 1 - 12
  • [4] A multimodal teleoperation interface for human-robot collaboration
    Si, Weiyong
    Zhong, Tianjian
    Wang, Ning
    Yang, Chenguang
    2023 IEEE INTERNATIONAL CONFERENCE ON MECHATRONICS (ICM), 2023
  • [5] Safe Multimodal Communication in Human-Robot Collaboration
    Ferrari, Davide
    Pupa, Andrea
    Signoretti, Alberto
    Secchi, Cristian
    HUMAN-FRIENDLY ROBOTICS 2023, HFR 2023, 2024, 29 : 151 - 163
  • [6] Symbol Grounding from Natural Conversation for Human-Robot Communication
    Thu, Ye Kyaw
    Ishida, Takuya
    Iwahashi, Naoto
    Nakamura, Tomoaki
    Nagai, Takayuki
    PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON HUMAN AGENT INTERACTION (HAI'17), 2017, : 415 - 419
  • [7] Timing of Multimodal Robot Behaviors during Human-Robot Collaboration
    Jensen, Lars Christian
    Fischer, Kerstin
    Suvei, Stefan-Daniel
    Bodenhagen, Leon
    2017 26TH IEEE INTERNATIONAL SYMPOSIUM ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION (RO-MAN), 2017, : 1061 - 1066
  • [8] A Multimodal Human-Robot Interaction Manager for Assistive Robots
    Abbasi, Bahareh
    Monaikul, Natawut
    Rysbek, Zhanibek
    Di Eugenio, Barbara
    Zefran, Milos
    2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2019, : 6756 - 6762
  • [9] HARMONIC: A multimodal dataset of assistive human-robot collaboration
    Newman, Benjamin A.
    Aronson, Reuben M.
    Srinivasa, Siddhartha S.
    Kitani, Kris
    Admoni, Henny
    INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2022, 41 (01): : 3 - 11
  • [10] Mobile Multimodal Human-Robot Interface for Virtual Collaboration
    Song, Young Eun
    Niitsuma, Mihoko
    Kubota, Takashi
    Hashimoto, Hideki
    Son, Hyoung Il
    3RD IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFOCOMMUNICATIONS (COGINFOCOM 2012), 2012, : 627 - 631