Augmented Commonsense Knowledge for Remote Object Grounding

Cited by: 0

Authors:

Mohammadi, Bahram [1]

Hong, Yicong [2]

Qi, Yuankai [3]

Wu, Qi [1]

Pan, Shirui [4]

Shi, Javen Qinfeng [1]

Affiliations:

[1] Univ Adelaide, Australian Inst Machine Learning AIML, Adelaide, SA, Australia

[2] Australian Natl Univ, Canberra, ACT, Australia

[3] Macquarie Univ, Sydney, NSW, Australia

[4] Griffith Univ, Nathan, Qld, Australia

Source:

THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 5 | 2024

Keywords:

LANGUAGE;

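DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract: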

The vision-and-language navigation (VLN) task requires an agent to perceive its surroundings, follow natural language instructions, and act in photo-realistic unseen environments. Most existing methods use entire-image or object features to represent navigable viewpoints. However, these representations are insufficient for proper action prediction, especially for the REVERIE task, which uses concise high-level instructions such as "Bring me the blue cushion in the master bedroom". To enhance these representations, we propose an augmented commonsense knowledge model (ACK) that leverages commonsense information as a spatio-temporal knowledge graph to improve agent navigation. Specifically, the proposed approach constructs a knowledge base by retrieving commonsense information from ConceptNet, followed by a refinement module that removes noisy and irrelevant knowledge. We further present ACK, which consists of knowledge graph-aware cross-modal and concept aggregation modules that enhance visual representation and visual-textual data alignment by integrating visible objects, commonsense knowledge, and concept history, which includes object and knowledge temporal information. Moreover, we add a new pipeline for commonsense-based decision making, which leads to more accurate local action prediction. Experimental results demonstrate that our proposed model noticeably outperforms the baseline and achieves state-of-the-art performance on the REVERIE benchmark. The source code is available at https://github.com/BahramMohammadi/ACK.

Pages: 4269-4277

Page count: 9
