When Robots Get Chatty: Grounding Multimodal Human-Robot Conversation and Collaboration

Cited: 0
Authors
Allgeuer, Philipp [1 ]
Ali, Hassan [1 ]
Wermter, Stefan [1 ]
Affiliation
[1] Univ Hamburg, Dept Informat, Knowledge Technol, Hamburg, Germany
Source
ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING-ICANN 2024, PT IV | 2024 / Vol. 15019
Keywords
Natural Dialog for Robots; LLM Grounding; AI-Enabled Robotics; Multimodal Interaction;
DOI
10.1007/978-3-031-72341-4_21
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
We investigate the use of Large Language Models (LLMs) to equip neural robotic agents with human-like social and cognitive competencies, for the purpose of open-ended human-robot conversation and collaboration. We introduce a modular and extensible methodology for grounding an LLM with the sensory perceptions and capabilities of a physical robot, and integrate multiple deep learning models throughout the architecture as a form of system integration. The integrated models encompass functions such as speech recognition, speech generation, open-vocabulary object detection, human pose estimation, and gesture detection, with the LLM serving as the central text-based coordinating unit. The qualitative and quantitative results demonstrate the significant potential of LLMs for providing emergent cognition and interactive language-oriented control of robots in a natural and social manner. Video: https://youtu.be/A2WLEuiM3-s.
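The abstract describes an architecture in which several perception models verbalize their outputs as text, and an LLM acts as the central text-based coordinator over those pooled observations. The following is a minimal, hypothetical Python sketch of that idea only; it is not the authors' code, and the class names (PerceptionModule, LLMCoordinator), module names, and the stubbed reply logic are assumptions made purely for illustration.

```python
# Hypothetical sketch of an LLM-coordinated multimodal robot architecture.
# Perception modules contribute short text observations; a central
# text-based coordinator consumes them together with the user's utterance.
# All names and the stubbed reply are illustrative assumptions, not the
# paper's implementation.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Observation:
    source: str   # e.g. "object_detector", "gesture_detector"
    text: str     # natural-language description of the percept


class PerceptionModule:
    """Wraps any sensor-processing model that can verbalize its output."""

    def __init__(self, name: str, describe: Callable[[], str]):
        self.name = name
        self.describe = describe

    def observe(self) -> Observation:
        return Observation(source=self.name, text=self.describe())


class LLMCoordinator:
    """Central text-based unit: turns pooled observations plus the user's
    speech into a reply. A real system would call an LLM here; this stub
    only echoes the grounded prompt to keep the sketch self-contained."""

    def respond(self, user_utterance: str, observations: List[Observation]) -> str:
        context = "; ".join(f"[{o.source}] {o.text}" for o in observations)
        prompt = f"Perceptions: {context}\nUser: {user_utterance}\nRobot:"
        return f"(LLM reply conditioned on) {prompt}"


if __name__ == "__main__":
    modules = [
        PerceptionModule("object_detector", lambda: "a red cup is on the table"),
        PerceptionModule("gesture_detector", lambda: "the user is pointing left"),
    ]
    coordinator = LLMCoordinator()
    percepts = [m.observe() for m in modules]
    print(coordinator.respond("Can you hand me that?", percepts))
```

In this reading of the abstract, new modalities are added by registering further modules that emit text, which is what keeps the design modular and extensible around a single language-based coordinator.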
Pages: 306-321
Number of pages: 16