Augmented Commonsense Knowledge for Remote Object Grounding

Cited by: 0

Authors:

Mohammadi, Bahram [1]

Hong, Yicong [2]

Qi, Yuankai [3]

Wu, Qi [1]

Pan, Shirui [4]

Shi, Javen Qinfeng [1]

Affiliations:

[1] Univ Adelaide, Australian Inst Machine Learning AIML, Adelaide, SA, Australia

[2] Australian Natl Univ, Canberra, ACT, Australia

[3] Macquarie Univ, Sydney, NSW, Australia

[4] Griffith Univ, Nathan, Qld, Australia

Source:

THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 5 | 2024

Keywords:

LANGUAGE;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The vision-and-language navigation (VLN) task requires an agent to perceive its surroundings, follow natural language instructions, and act in photo-realistic unseen environments. Most existing methods use whole-image or object features to represent navigable viewpoints. However, these representations are insufficient for proper action prediction, especially for the REVERIE task, which uses concise high-level instructions such as "Bring me the blue cushion in the master bedroom". To enhance these representations, we propose an augmented commonsense knowledge model (ACK) that leverages commonsense information as a spatio-temporal knowledge graph to improve agent navigation. Specifically, the proposed approach constructs a knowledge base by retrieving commonsense information from ConceptNet, followed by a refinement module that removes noisy and irrelevant knowledge. ACK consists of knowledge graph-aware cross-modal and concept aggregation modules that enhance visual representation and visual-textual data alignment by integrating visible objects, commonsense knowledge, and concept history, which includes temporal information about objects and knowledge. Moreover, we add a new pipeline for the commonsense-based decision-making process, which leads to more accurate local action prediction. Experimental results demonstrate that the proposed model noticeably outperforms the baseline and achieves state-of-the-art performance on the REVERIE benchmark. The source code is available at https://github.com/BahramMohammadi/ACK.
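
The retrieve-then-refine step described in the abstract might look roughly like the following sketch. It is not the authors' implementation: the toy triples, their weights, the `retrieve` and `refine` helpers, the weight threshold, and the relation whitelist are all hypothetical, chosen only to illustrate filtering ConceptNet-style (head, relation, tail, weight) facts for the objects visible at a viewpoint.

```python
from typing import Dict, List, NamedTuple, Sequence


class Triple(NamedTuple):
    head: str
    relation: str
    tail: str
    weight: float


# Hypothetical toy facts in a ConceptNet-like form; the weights are made up.
TOY_KB = [
    Triple("cushion", "AtLocation", "sofa", 2.8),
    Triple("cushion", "AtLocation", "bedroom", 2.0),
    Triple("cushion", "RelatedTo", "pillow", 3.5),
    Triple("cushion", "RelatedTo", "pin", 0.4),
    Triple("bed", "AtLocation", "bedroom", 4.1),
    Triple("bed", "UsedFor", "sleeping", 5.0),
]


def retrieve(kb: Sequence[Triple], objects: Sequence[str],
             top_k: int = 5) -> Dict[str, List[Triple]]:
    """Return up to top_k highest-weight triples for each visible object."""
    result = {}
    for obj in objects:
        candidates = sorted((t for t in kb if t.head == obj),
                            key=lambda t: t.weight, reverse=True)
        result[obj] = candidates[:top_k]
    return result


def refine(retrieved: Dict[str, List[Triple]], min_weight: float = 1.0,
           allowed_relations: Sequence[str] = ("AtLocation", "RelatedTo", "UsedFor"),
           ) -> Dict[str, List[Triple]]:
    """Drop low-weight triples and relations outside the whitelist (assumed rule)."""
    return {
        obj: [t for t in triples
              if t.weight >= min_weight and t.relation in allowed_relations]
        for obj, triples in retrieved.items()
    }


# Visible objects at a viewpoint (assumed to come from an object detector).
knowledge = refine(retrieve(TOY_KB, ["cushion", "bed"]))
for obj, triples in knowledge.items():
    print(obj, [(t.relation, t.tail) for t in triples])
```

In this toy run the low-weight fact linking "cushion" to "pin" is discarded, while location facts such as cushion-AtLocation-bedroom survive and could then serve as nodes in a knowledge graph.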


Pages: 4269-4277

Page count: 9

