When Robots Get Chatty: Grounding Multimodal Human-Robot Conversation and Collaboration

Cited by: 0
Authors
Allgeuer, Philipp [1 ]
Ali, Hassan [1 ]
Wermter, Stefan [1 ]
Affiliations
[1] Univ Hamburg, Dept Informat, Knowledge Technol, Hamburg, Germany
Keywords
Natural Dialog for Robots; LLM Grounding; AI-Enabled Robotics; Multimodal Interaction;
DOI
10.1007/978-3-031-72341-4_21
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We investigate the use of Large Language Models (LLMs) to equip neural robotic agents with human-like social and cognitive competencies, for the purpose of open-ended human-robot conversation and collaboration. We introduce a modular and extensible methodology for grounding an LLM in the sensory perceptions and capabilities of a physical robot, and integrate multiple deep learning models throughout the architecture as a form of system integration. The integrated models cover speech recognition, speech generation, open-vocabulary object detection, human pose estimation, and gesture detection, with the LLM serving as the central text-based coordinating unit. Qualitative and quantitative results demonstrate the significant potential of LLMs for providing emergent cognition and interactive, language-oriented control of robots in a natural and social manner. Video: https://youtu.be/A2WLEuiM3-s.
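The abstract describes an architecture in which perception modules feed the central LLM with text, and the LLM's textual output drives the robot. A minimal illustrative sketch of that coordination pattern is shown below; all function names are hypothetical and the LLM is replaced by a stub, since the paper's actual interfaces are not given here.

```python
# Sketch of an LLM-as-coordinator step: each perception module reports
# its observation as text, the (stubbed) LLM consumes the combined
# context, and returns a text action for the robot to execute.

def detect_objects() -> str:
    # Stand-in for an open-vocabulary object detector.
    return "objects: red cup on table"

def detect_gesture() -> str:
    # Stand-in for a pose estimation / gesture detection model.
    return "gesture: human points at the red cup"

def stub_llm(context: str) -> str:
    # Stand-in for the central text-based LLM; a real system would
    # query an actual language model with this context.
    if "points at" in context and "red cup" in context:
        return "pick_up(red cup)"
    return "wait()"

def coordination_step() -> str:
    # Serialize all perceptions into one text context for the LLM.
    context = "\n".join([detect_objects(), detect_gesture()])
    return stub_llm(context)

print(coordination_step())  # -> pick_up(red cup)
```

The key design point conveyed by the abstract is that text is the common currency: every modality is reduced to text before reaching the LLM, which keeps the architecture modular and extensible.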
Pages: 306-321 (16 pages)
Related Papers
50 records
  • [1] Multimodal Interface for Human-Robot Collaboration
    Rautiainen, Samu
    Pantano, Matteo
    Traganos, Konstantinos
    Ahmadi, Seyedamir
    Saenz, Jose
    Mohammed, Wael M.
    Lastra, Jose L. Martinez
    MACHINES, 2022, 10 (10)
  • [2] Human-Robot Collaboration Using Industrial Robots
    Antonelli, Dario
    Bruno, Giulia
    PROCEEDINGS OF THE 2017 2ND INTERNATIONAL CONFERENCE ON ELECTRICAL, AUTOMATION AND MECHANICAL ENGINEERING (EAME 2017), 2017, 86 : 99 - 102
  • [3] Natural multimodal communication for human-robot collaboration
    Maurtua, Inaki
    Fernandez, Izaskun
    Tellaeche, Alberto
    Kildal, Johan
    Susperregi, Loreto
    Ibarguren, Aitor
    Sierra, Basilio
    INTERNATIONAL JOURNAL OF ADVANCED ROBOTIC SYSTEMS, 2017, 14 (04): : 1 - 12
  • [4] A multimodal teleoperation interface for human-robot collaboration
    Si, Weiyong
    Zhong, Tianjian
    Wang, Ning
    Yang, Chenguang
    2023 IEEE INTERNATIONAL CONFERENCE ON MECHATRONICS (ICM), 2023
  • [5] Safe Multimodal Communication in Human-Robot Collaboration
    Ferrari, Davide
    Pupa, Andrea
    Signoretti, Alberto
    Secchi, Cristian
    HUMAN-FRIENDLY ROBOTICS 2023, HFR 2023, 2024, 29 : 151 - 163
  • [6] Symbol Grounding from Natural Conversation for Human-Robot Communication
    Thu, Ye Kyaw
    Ishida, Takuya
    Iwahashi, Naoto
    Nakamura, Tomoaki
    Nagai, Takayuki
    PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON HUMAN AGENT INTERACTION (HAI'17), 2017, : 415 - 419
  • [7] Timing of Multimodal Robot Behaviors during Human-Robot Collaboration
    Jensen, Lars Christian
    Fischer, Kerstin
    Suvei, Stefan-Daniel
    Bodenhagen, Leon
    2017 26TH IEEE INTERNATIONAL SYMPOSIUM ON ROBOT AND HUMAN INTERACTIVE COMMUNICATION (RO-MAN), 2017, : 1061 - 1066
  • [8] A Multimodal Human-Robot Interaction Manager for Assistive Robots
    Abbasi, Bahareh
    Monaikul, Natawut
    Rysbek, Zhanibek
    Di Eugenio, Barbara
    Zefran, Milos
    2019 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2019, : 6756 - 6762
  • [9] HARMONIC: A multimodal dataset of assistive human-robot collaboration
    Newman, Benjamin A.
    Aronson, Reuben M.
    Srinivasa, Siddhartha S.
    Kitani, Kris
    Admoni, Henny
    INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2022, 41 (01): : 3 - 11
  • [10] Mobile Multimodal Human-Robot Interface for Virtual Collaboration
    Song, Young Eun
    Niitsuma, Mihoko
    Kubota, Takashi
    Hashimoto, Hideki
    Son, Hyoung Il
    3RD IEEE INTERNATIONAL CONFERENCE ON COGNITIVE INFOCOMMUNICATIONS (COGINFOCOM 2012), 2012, : 627 - 631