Interpreting Natural Language Instructions Using Language, Vision, and Behavior

Cited by: 3
Authors
Benotti, Luciana [1, 2]
Lau, Tessa [3]
Villalba, Martin [1, 4]
Affiliations
[1] Univ Nacl Cordoba, Cordoba, Argentina
[2] Consejo Nacl Invest Cient & Tecn, Buenos Aires, DF, Argentina
[3] Savioke Inc, Sunnyvale, CA USA
[4] Univ Potsdam, D-14476 Potsdam, Germany
Keywords
Design; Algorithms; Performance; Natural language interpretation; multimodal understanding; action recognition; visual feedback; situated virtual agent; unsupervised learning
DOI
10.1145/2629632
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We define the problem of automatic instruction interpretation as follows. Given a natural language instruction, can we automatically predict what an instruction follower, such as a robot, should do in the environment to follow that instruction? Previous approaches to automatic instruction interpretation have required either extensive domain-dependent rule writing or extensive manually annotated corpora. This article presents a novel approach that leverages a large amount of unannotated, easy-to-collect data from humans interacting in a game-like environment. Our approach uses an automatic annotation phase based on artificial intelligence planning, for which two different annotation strategies are compared: one based on behavioral information and the other based on visibility information. The resulting annotations are used as training data for different automatic classifiers. This algorithm is based on the intuition that the problem of interpreting a situated instruction can be cast as a classification problem of choosing among the actions that are possible in the situation. Classification is done by combining language, vision, and behavior information. Our empirical analysis shows that machine learning classifiers achieve 77% accuracy on this task on available English corpora and 74% on similar German corpora. Finally, the inclusion of human feedback in the interpretation process is shown to boost performance to 92% for the English corpus and 90% for the German corpus.
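The abstract's central idea, that interpreting a situated instruction means choosing among the actions possible in the current situation, can be illustrated with a minimal sketch. This is not the authors' implementation: the training pairs, the action labels, and the function names `train`, `score`, and `interpret` are all hypothetical, a simple Naive Bayes word model stands in for the article's classifiers, and the planning-based automatic annotation phase, the vision and behavior features, and the human feedback step are omitted.

```python
import math
from collections import Counter, defaultdict

# Hypothetical (instruction, action) pairs standing in for automatically
# annotated training data; the labels are invented for illustration.
PAIRS = [
    ("press the red button", "press(red_button)"),
    ("push the red button", "press(red_button)"),
    ("go through the door", "move(door)"),
    ("walk to the door", "move(door)"),
    ("press the blue button", "press(blue_button)"),
]


def train(pairs):
    """Count words per action and occurrences of each action."""
    word_counts = defaultdict(Counter)
    action_counts = Counter()
    vocab = set()
    for text, action in pairs:
        tokens = text.lower().split()
        word_counts[action].update(tokens)
        action_counts[action] += 1
        vocab.update(tokens)
    return word_counts, action_counts, vocab


def score(tokens, action, model):
    """Log-probability of an action given the tokens (add-one smoothing)."""
    word_counts, action_counts, vocab = model
    total = sum(action_counts.values())
    logp = math.log((action_counts[action] + 1) / (total + len(action_counts) + 1))
    denom = sum(word_counts[action].values()) + len(vocab) + 1
    for tok in tokens:
        logp += math.log((word_counts[action][tok] + 1) / denom)
    return logp


def interpret(instruction, possible_actions, model):
    """Pick the best-scoring action among those possible in the situation."""
    tokens = instruction.lower().split()
    return max(possible_actions, key=lambda a: score(tokens, a, model))


model = train(PAIRS)
# Only actions that are possible in the current situation are candidates.
print(interpret("push the red one", ["press(red_button)", "move(door)"], model))
print(interpret("press the red button", ["press(blue_button)", "move(door)"], model))
```

Restricting the candidates to `possible_actions` is the point of the sketch: the same instruction can resolve to different actions depending on what the situation allows, as the second call shows when the red button is unavailable.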
Pages: 22