Benchmarking question answering systems

Cited by: 16

Authors:

Usbeck, Ricardo ^{[1]}

Roeder, Michael ^{[1]}

Hoffmann, Michael ^{[3]}

Conrads, Felix ^{[1]}

Huthmann, Jonathan ^{[3]}

Ngonga-Ngomo, Axel-Cyrille ^{[1]}

Demmler, Christian ^{[3]}

Unger, Christina ^{[2]}

Affiliations:

[1] Paderborn Univ, DICE Data Sci Grp, Paderborn, Germany

[2] Univ Bielefeld, CITEC, Bielefeld, Germany

[3] Univ Leipzig, AKSW Grp, Leipzig, Germany

Source:

SEMANTIC WEB | 2019 / Vol. 10 / Issue 02

Funding:

EU Horizon 2020;

Keywords:

Factoid question answering; benchmarking; repeatable open research;

DOI:

10.3233/SW-180312

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The necessity of making the Semantic Web more accessible for lay users, alongside the uptake of interactive systems and smart assistants for the Web, has spawned a new generation of RDF-based question answering systems. However, fair evaluation of these systems remains a challenge due to the different types of answers that they provide. Hence, repeating currently published experiments or even benchmarking on the same datasets remains a complex and time-consuming task. We present a novel online benchmarking platform for question answering (QA) that relies on the FAIR principles to support the fine-grained evaluation of question answering systems. We detail how the platform addresses the fair benchmarking of question answering systems through the rewriting of URIs and URLs. In addition, we implement different evaluation metrics, measures, datasets and pre-implemented systems as well as methods to work with novel formats for interactive and non-interactive benchmarking of question answering systems. Our analysis of current frameworks shows that most are tailored towards particular datasets and challenges but do not provide generic models. In addition, while most frameworks perform well in the annotation of entities and properties, the generation of SPARQL queries from annotated text remains a challenge.

Citation

Pages: 293 - 304

Page count: 12

50 items in total

[21] Benchmarking Geospatial Question Answering Engines Using the Dataset GEOQUESTIONS1089
Kefalidis, Sergios-Anestis
Punjani, Dharmen
Tsalapati, Eleni
Plas, Konstantinos
Pollali, Mariangela
Mitsios, Michail
Tsokanaridou, Myrto
Koubarakis, Manolis
Maret, Pierre
SEMANTIC WEB, ISWC 2023, PT II, 2023, 14266 : 266 - 284
[22] RobustQA: Benchmarking the Robustness of Domain Adaptation for Open-Domain Question Answering
Han, Rujun
Qi, Peng
Zhang, Yuhao
Liu, Lan
Burger, Juliette
Wang, William Yang
Huang, Zhiheng
Xiang, Bing
Roth, Dan
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, 2023, : 4294 - 4311
[23] Measuring Retrieval Complexity in Question Answering Systems
Gabburo, Matteo
Jedema, Nicolaas Paul
Garg, Siddhant
Ribeiro, Leonardo F. R.
Moschitti, Alessandro
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: ACL 2024, 2024, : 14636 - 14650
[24] The measurement of user satisfaction with question answering systems
Ong, Chorng-Shyong
Day, Min-Yuh
Hsu, Wen-Lian
INFORMATION & MANAGEMENT, 2009, 46 (07) : 397 - 403
[25] Question Answering Systems: A Systematic Literature Review
Alanazi, Sarah Saad
Elfadil, Nazar
Jarajreh, Mutsam
Algarni, Saad
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (03) : 495 - 502
[26] Usability survey of biomedical question answering systems
Bauer, Michael A.
Berleant, Daniel
HUMAN GENOMICS, 2012, 6
[27] Question-answering systems in knowledge management
Moldovan, D
IEEE INTELLIGENT SYSTEMS, 2001, 16 (06) : 90 - 92
[28] Usability survey of biomedical question answering systems
Michael A Bauer
Daniel Berleant
Human Genomics, 6
[29] Humor Detection in Product Question Answering Systems
Ziser, Yftah
Kravi, Elad
Carmel, David
PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 519 - 528
[30] Preference reasoning in advanced question answering systems
Benamara, Farah
Kaci, Souhila
AI 2006: ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2006, 4304 : 1116 - +
