StepGame: A New Benchmark for Robust Multi-Hop Spatial Reasoning in Texts

Times cited: 0
Authors
Shi, Zhengxiang [1]
Zhang, Qiang [2]
Lipani, Aldo [1]
Affiliations
[1] UCL, London, England
[2] Zhejiang Univ, Hangzhou, Zhejiang, People's Republic of China
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Inferring spatial relations in natural language is a crucial ability an intelligent system should possess. The bAbI dataset tries to capture tasks relevant to this domain (tasks 17 and 19). However, these tasks have several limitations. Most importantly, they are limited to fixed expressions, they are limited in the number of reasoning steps required to solve them, and they fail to test the robustness of models to input that contains irrelevant or redundant information. In this paper, we present a new Question-Answering dataset called StepGame for robust multi-hop spatial reasoning in texts. Our experiments demonstrate that state-of-the-art models on the bAbI dataset struggle on the StepGame dataset. Moreover, we propose a Tensor-Product based Memory-Augmented Neural Network (TP-MANN) specialized for spatial reasoning tasks. Experimental results on both datasets show that our model outperforms all the baselines with superior generalization and robustness performance.
Pages: 11321-11329
Number of pages: 9
Related papers
  • [1] Disentangling Extraction and Reasoning in Multi-hop Spatial Reasoning
    Mirzaee, Roshanak
    Kordjamshidi, Parisa
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 3379 - 3397
  • [2] Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval
    Khattab, Omar
    Potts, Christopher
    Zaharia, Matei
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [3] Visual Reasoning with Multi-hop Feature Modulation
    Strub, Florian
    Seurin, Mathieu
    Perez, Ethan
    de Vries, Harm
    Mary, Jeremie
    Preux, Philippe
    Courville, Aaron
    Pietquin, Olivier
    COMPUTER VISION - ECCV 2018, PT V, 2018, 11209 : 808 - 831
  • [4] GaussianPath: A Bayesian Multi-Hop Reasoning Framework for Knowledge Graph Reasoning
    Wan, Guojia
    Du, Bo
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 4393 - 4401
  • [5] Is Multi-Hop Reasoning Really Explainable? Towards Benchmarking Reasoning Interpretability
    Lv, Xin
    Cao, Yixin
    Hou, Lei
    Li, Juanzi
    Liu, Zhiyuan
    Zhang, Yichi
    Dai, Zelin
    2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 8899 - 8911
  • [6] Improved Multi-hop Reasoning Through Sampling and Aggregating
    Luo, Mengyu
    Chen, Jianxia
    Yan, Qi
    Jiang, Gaohang
    Dong, Shi
    Xiao, Liang
    Huang, Zhongwei
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING-ICANN 2024, PT I, 2024, 15016 : 131 - 146
  • [7] Dynamically Fused Graph Network for Multi-hop Reasoning
    Qiu, Lin
    Xiao, Yunxuan
    Qu, Yanru
    Zhou, Hao
    Li, Lei
    Zhang, Weinan
    Yu, Yong
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 6140 - 6150
  • [8] Multi-Hop Knowledge Graph Reasoning with Reward Shaping
    Lin, Xi Victoria
    Socher, Richard
    Xiong, Caiming
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 3243 - 3253
  • [9] Multi-Hop Reasoning for Question Answering with Knowledge Graph
    Zhang, Jiayuan
    Cai, Yifei
    Zhang, Qian
    Cao, Zehao
    Cheng, Zhenrong
    Li, Dongmei
    Meng, Xianghao
    2021 IEEE/ACIS 20TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS 2021-SUMMER), 2021, : 121 - 125
  • [10] Understanding Dataset Design Choices for Multi-hop Reasoning
    Chen, Jifan
    Durrett, Greg
    2019 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL HLT 2019), VOL. 1, 2019, : 4026 - 4032