Augmented Commonsense Knowledge for Remote Object Grounding

Cited by: 0
Authors
Mohammadi, Bahram [1]
Hong, Yicong [2]
Qi, Yuankai [3]
Wu, Qi [1]
Pan, Shirui [4]
Shi, Javen Qinfeng [1]
Affiliations
[1] Univ Adelaide, Australian Inst Machine Learning AIML, Adelaide, SA, Australia
[2] Australian Natl Univ, Canberra, ACT, Australia
[3] Macquarie Univ, Sydney, NSW, Australia
[4] Griffith Univ, Nathan, Qld, Australia
Source
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 5 | 2024
Keywords
LANGUAGE;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The vision-and-language navigation (VLN) task requires an agent to perceive its surroundings, follow natural language instructions, and act in photo-realistic unseen environments. Most existing methods employ entire-image or object features to represent navigable viewpoints. However, these representations are insufficient for proper action prediction, especially for the REVERIE task, which uses concise high-level instructions, such as "Bring me the blue cushion in the master bedroom". To enhance these representations, we propose an augmented commonsense knowledge model (ACK) that leverages commonsense information as a spatio-temporal knowledge graph to improve agent navigation. Specifically, the proposed approach constructs a knowledge base by retrieving commonsense information from ConceptNet, followed by a refinement module that removes noisy and irrelevant knowledge. We further present ACK, which consists of knowledge graph-aware cross-modal and concept aggregation modules that enhance visual representation and visual-textual data alignment by integrating visible objects, commonsense knowledge, and concept history, which includes object and knowledge temporal information. Moreover, we add a new pipeline for the commonsense-based decision-making process, which leads to more accurate local action prediction. Experimental results demonstrate that our proposed model noticeably outperforms the baseline and achieves the state of the art on the REVERIE benchmark. The source code is available at https://github.com/BahramMohammadi/ACK.
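The knowledge-base step described in the abstract (retrieve commonsense facts for visible objects, then refine away noisy ones) might be illustrated roughly as follows. This is only a minimal sketch and not the authors' implementation: the toy triples, their weights, the function name `build_knowledge_base`, and the `min_weight` / `top_k` refinement rule are all assumptions made for illustration; the record does not state how ACK actually queries ConceptNet or filters knowledge.

```python
from collections import defaultdict

# Hypothetical toy triples (head, relation, tail, weight).
# Illustrative values only; these are NOT real ConceptNet data.
TOY_TRIPLES = [
    ("cushion", "AtLocation", "sofa", 3.0),
    ("cushion", "AtLocation", "bedroom", 2.0),
    ("cushion", "RelatedTo", "pin", 0.4),
    ("bed", "AtLocation", "bedroom", 4.0),
    ("bed", "UsedFor", "sleeping", 3.5),
    ("bed", "RelatedTo", "river", 0.3),
]


def build_knowledge_base(objects, triples, min_weight=1.0, top_k=2):
    """Collect facts for the given objects and apply an assumed refinement rule.

    Refinement here (an assumption, not the paper's module): drop facts whose
    weight is below `min_weight`, then keep the `top_k` heaviest per object.
    """
    by_head = defaultdict(list)
    for head, relation, tail, weight in triples:
        if head in objects and weight >= min_weight:
            by_head[head].append((relation, tail, weight))
    # Sort each object's facts by descending weight and truncate to top_k.
    return {
        head: sorted(facts, key=lambda fact: -fact[2])[:top_k]
        for head, facts in by_head.items()
    }


kb = build_knowledge_base({"cushion", "bed"}, TOY_TRIPLES)
for obj, facts in sorted(kb.items()):
    print(obj, facts)
```

In this toy setting the low-weight facts ("pin", "river") are discarded, leaving only location and usage facts that could plausibly help ground an object in a room. A weight threshold is just one simple stand-in for a refinement step.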
Pages: 4269 - 4277
Number of pages: 9
Related papers
50 in total
  • [21] Commonsense Knowledge Base Completion
    Li, Xiang
    Taheri, Aynaz
    Tu, Lifu
    Gimpel, Kevin
    PROCEEDINGS OF THE 54TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, VOL 1, 2016, : 1445 - 1455
  • [22] A Survey of Commonsense Knowledge Acquisition
    臧良俊
    曹聪
    曹亚男
    吴昱明
    曹存根
    Journal of Computer Science & Technology, 2013, 28 (04) : 689 - 719
  • [23] CogNet: Bridging Linguistic Knowledge, World Knowledge and Commonsense Knowledge
    Wang, Chenhao
    Chen, Yubo
    Xue, Zhipeng
    Zhou, Yang
    Zhao, Jun
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 16114 - 16116
  • [24] Active object search in an unknown large-scale environment using commonsense knowledge and spatial relations
    Kim, Mingu
    Suh, Il Hong
    INTELLIGENT SERVICE ROBOTICS, 2019, 12 (04) : 371 - 380
  • [25] Active object search in an unknown large-scale environment using commonsense knowledge and spatial relations
    Mingu Kim
    Il Hong Suh
    Intelligent Service Robotics, 2019, 12 : 371 - 380
  • [26] Commonsense Spatial Knowledge-aware 3-D Human Motion and Object Interaction Prediction
    Lee, Sang Uk
    2024 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2024, 2024, : 3057 - 3063
  • [27] Constructing Uyghur Commonsense Knowledge Base by Knowledge Projection
    Anwar, Azmat
    Li, Xiao
    Yang, Yating
    Wang, Yajuan
    APPLIED SCIENCES-BASEL, 2019, 9 (16):
  • [28] Commonsense-Aware Object Value Graph for Object Goal Navigation
    Yoo, Hwiyeon
    Choi, Yunho
    Park, Jeongho
    Oh, Songhwai
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (05) : 4423 - 4430
  • [29] Visually Grounded Commonsense Knowledge Acquisition
    Yao, Yuan
    Yu, Tianyu
    Zhang, Ao
    Li, Mengdi
    Xie, Ruobing
    Weber, Cornelius
    Liu, Zhiyuan
    Zheng, Hai-Tao
    Wermter, Stefan
    Chua, Tat-Seng
    Sun, Maosong
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 5, 2023, : 6583 - 6592
  • [30] Generated Knowledge Prompting for Commonsense Reasoning
    Liu, Jiacheng
    Liu, Alisa
    Lu, Ximing
    Welleck, Sean
    West, Peter
    Le Bras, Ronan
    Choi, Yejin
    Hajishirzi, Hannaneh
    PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 3154 - 3169