Benchmarking question answering systems

Cited by: 16
Authors
Usbeck, Ricardo [1]
Roeder, Michael [1]
Hoffmann, Michael [3]
Conrads, Felix [1]
Huthmann, Jonathan [3]
Ngonga-Ngomo, Axel-Cyrille [1]
Demmler, Christian [3]
Unger, Christina [2]
Affiliations
[1] Paderborn Univ, DICE Data Sci Grp, Paderborn, Germany
[2] Univ Bielefeld, CITEC, Bielefeld, Germany
[3] Univ Leipzig, AKSW Grp, Leipzig, Germany
Funding
European Union Horizon 2020
Keywords
Factoid question answering; benchmarking; repeatable open research
DOI
10.3233/SW-180312
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The necessity of making the Semantic Web more accessible for lay users, alongside the uptake of interactive systems and smart assistants for the Web, has spawned a new generation of RDF-based question answering systems. However, fair evaluation of these systems remains a challenge due to the different types of answers that they provide. Hence, repeating currently published experiments or even benchmarking on the same datasets remains a complex and time-consuming task. We present a novel online benchmarking platform for question answering (QA) that relies on the FAIR principles to support the fine-grained evaluation of question answering systems. We detail how the platform addresses the fair benchmarking of question answering systems through the rewriting of URIs and URLs. In addition, we implement different evaluation metrics, measures, datasets and pre-implemented systems as well as methods to work with novel formats for interactive and non-interactive benchmarking of question answering systems. Our analysis of current frameworks shows that most of them are tailored towards particular datasets and challenges but do not provide generic models. In addition, while most frameworks perform well in the annotation of entities and properties, the generation of SPARQL queries from annotated text remains a challenge.
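To illustrate why rewriting URIs matters for fair scoring, the following is a minimal sketch, not the platform's actual implementation. It assumes answers are sets of URIs, that normalization only strips angle brackets, unifies the scheme and decodes percent-escapes, and that an empty gold set matched by an empty prediction scores 1.0. The names `normalize_uri` and `precision_recall_f1` are hypothetical.

```python
from urllib.parse import unquote


def normalize_uri(uri: str) -> str:
    """Hypothetical normalization so equivalent URIs compare equal."""
    uri = uri.strip().strip("<>")          # drop SPARQL-style angle brackets
    if uri.startswith("https://"):         # assumption: scheme is irrelevant
        uri = "http://" + uri[len("https://"):]
    return unquote(uri)                    # decode percent-escapes


def precision_recall_f1(gold, predicted):
    """Set-based precision, recall and F1 for one question."""
    gold_set = {normalize_uri(u) for u in gold}
    pred_set = {normalize_uri(u) for u in predicted}
    if not gold_set and not pred_set:
        # Assumed convention: correctly returning nothing is a perfect answer.
        return 1.0, 1.0, 1.0
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = ["http://dbpedia.org/resource/Berlin", "http://dbpedia.org/resource/Paris"]
predicted = ["<https://dbpedia.org/resource/Berlin>", "http://dbpedia.org/resource/Rome"]
print(precision_recall_f1(gold, predicted))
```

Without the normalization step, the Berlin answer in this example would count as wrong merely because of its scheme and brackets, which is the kind of unfair penalty the abstract alludes to.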
Pages: 293-304
Page count: 12