Beyond IID: Three Levels of Generalization for Question Answering on Knowledge Bases

Cited by: 49
Authors
Gu, Yu [1]
Kase, Sue [2]
Vanni, Michelle T. [2]
Sadler, Brian M. [2]
Liang, Percy [3]
Yan, Xifeng [4]
Su, Yu [1]
Affiliations
[1] Ohio State Univ, Columbus, OH 43210 USA
[2] US Army Res Lab, Aberdeen Proving Ground, MD USA
[3] Stanford Univ, Stanford, CA 94305 USA
[4] Univ Calif Santa Barbara, Santa Barbara, CA 93106 USA
Funding
U.S. National Science Foundation
Keywords
Knowledge Base; Question Answering; Semantic Parsing;
DOI
10.1145/3442381.3449992
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing studies on question answering on knowledge bases (KBQA) mainly operate under the standard i.i.d. assumption, i.e., the training distribution over questions is the same as the test distribution. However, i.i.d. may be neither achievable nor desirable on large-scale KBs because 1) the true user distribution is hard to capture and 2) randomly sampling training examples from the enormous space would be data-inefficient. Instead, we suggest that KBQA models should have three levels of built-in generalization: i.i.d., compositional, and zero-shot. To facilitate the development of KBQA models with stronger generalization, we construct and release a new large-scale, high-quality dataset with 64,331 questions, GRAILQA, and provide evaluation settings for all three levels of generalization. In addition, we propose a novel BERT-based KBQA model. The combination of our dataset and model enables us to thoroughly examine and demonstrate, for the first time, the key role of pre-trained contextual embeddings like BERT in the generalization of KBQA.
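The abstract names the three generalization levels without defining them. As a rough illustration only, the sketch below shows one plausible way to label a test question by level: it assumes that zero-shot means the question uses at least one KB schema item never seen in training, compositional means all schema items were seen but their combination is new, and i.i.d. means the combination itself was seen. The `Example` type, its fields, the `generalization_level` function, and the schema item names are all hypothetical and are not taken from the paper or the GRAILQA release.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Example:
    """Hypothetical record for one KBQA question (not the GRAILQA format)."""
    question: str
    schema_items: FrozenSet[str]  # KB classes/relations used by the logical form
    template: str                 # logical form with entities masked out


def generalization_level(test: Example, train: List[Example]) -> str:
    """Label a test question under the assumed definitions in the lead-in."""
    seen_items = set()
    for ex in train:
        seen_items |= ex.schema_items
    seen_templates = {ex.template for ex in train}

    # Assumption: any unseen schema item makes the question zero-shot.
    if not test.schema_items <= seen_items:
        return "zero-shot"
    # Assumption: seen items in an unseen combination is compositional.
    if test.template not in seen_templates:
        return "compositional"
    return "i.i.d."


# Made-up schema items and templates, for illustration only.
train_set = [
    Example("who recorded X?", frozenset({"music.album.artist"}),
            "(JOIN music.album.artist ENT)"),
    Example("when was X released?", frozenset({"music.album.release_date"}),
            "(JOIN music.album.release_date ENT)"),
]
new_question = Example(
    "who recorded the album released on D?",
    frozenset({"music.album.artist", "music.album.release_date"}),
    "(JOIN music.album.artist (JOIN music.album.release_date LIT))",
)
level = generalization_level(new_question, train_set)
```

Under these assumed definitions, `new_question` reuses only schema items that appear in `train_set` but combines them in a new way, so it would be labeled compositional rather than i.i.d. or zero-shot.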
Pages: 3477-3488
Page count: 12