Benchmarking question answering systems

Cited: 16
Authors
Usbeck, Ricardo [1 ]
Roeder, Michael [1 ]
Hoffmann, Michael [3 ]
Conrads, Felix [1 ]
Huthmann, Jonathan [3 ]
Ngonga-Ngomo, Axel-Cyrille [1 ]
Demmler, Christian [3 ]
Unger, Christina [2 ]
Affiliations
[1] Paderborn Univ, DICE Data Sci Grp, Paderborn, Germany
[2] Univ Bielefeld, CITEC, Bielefeld, Germany
[3] Univ Leipzig, AKSW Grp, Leipzig, Germany
Funding
European Union Horizon 2020
Keywords
Factoid question answering; benchmarking; repeatable open research
DOI
10.3233/SW-180312
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The necessity of making the Semantic Web more accessible to lay users, alongside the uptake of interactive systems and smart assistants for the Web, has spawned a new generation of RDF-based question answering (QA) systems. However, the fair evaluation of these systems remains a challenge due to the different types of answers that they provide. Hence, repeating published experiments, or even benchmarking on the same datasets, remains a complex and time-consuming task. We present a novel online benchmarking platform for QA that relies on the FAIR principles to support the fine-grained evaluation of QA systems. We detail how the platform addresses the fair benchmarking of QA systems through the rewriting of URIs and URLs. In addition, we provide different evaluation metrics and measures, datasets, and pre-implemented systems, as well as methods for working with novel formats for the interactive and non-interactive benchmarking of QA systems. Our analysis shows that most current frameworks are tailored towards particular datasets and challenges and do not provide generic models. Moreover, while most frameworks perform well in the annotation of entities and properties, the generation of SPARQL queries from annotated text remains a challenge.
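To make the kind of fine-grained evaluation described above concrete, the following is a minimal Python sketch of the macro-averaged precision, recall, and F-measure commonly used by QA benchmarks such as QALD: per-question scores are computed over gold vs. system answer sets and then averaged across the dataset. This is an illustration of the metric family only, not the platform's actual code; all function names and data below are hypothetical.

    # Minimal sketch (hypothetical, not the platform's code): macro-averaged
    # precision, recall, and F-measure over per-question answer sets.

    def answer_prf(gold, system):
        """Precision, recall, F1 for one question's gold vs. system answer sets."""
        if not gold and not system:
            return 1.0, 1.0, 1.0  # both empty: treated as a perfect match here
        if not gold or not system:
            return 0.0, 0.0, 0.0
        tp = len(gold & system)  # answers the system got right
        p = tp / len(system)
        r = tp / len(gold)
        f = 2 * p * r / (p + r) if p + r > 0 else 0.0
        return p, r, f

    def macro_average(pairs):
        """Average the per-question scores over the whole dataset (macro)."""
        scores = [answer_prf(g, s) for g, s in pairs]
        return tuple(sum(col) / len(scores) for col in zip(*scores))

    # Hypothetical gold/system answer pairs (DBpedia-style URIs abbreviated).
    pairs = [
        ({"dbr:Berlin"}, {"dbr:Berlin"}),              # fully correct
        ({"dbr:Angela_Merkel"}, {"dbr:Germany"}),      # wrong answer
        ({"dbr:Rhine", "dbr:Danube"}, {"dbr:Rhine"}),  # partially correct
    ]
    print("macro P/R/F1: %.3f %.3f %.3f" % macro_average(pairs))

Micro-averaged variants, which pool true positives across all questions before dividing, can be derived from the same per-question counts; the convention for empty answer sets is a design choice that differs between benchmarks.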
Pages: 293-304
Number of pages: 12
Related papers
50 items in total
  • [41] Evaluating Multilingual Question Answering Systems at CLEF
    Forner, Pamela
    Giampiccolo, Danilo
    Magnini, Bernardo
    Penas, Anselmo
    Rodrigo, Alvaro
    Sutcliffe, Richard
    LREC 2010 - SEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2010, : 2774 - 2781
  • [42] Interactive Question Answering Systems: Literature Review
    Biancofiore, Giovanni Maria
    Deldjoo, Yashar
    Di Noia, Tommaso
    Di Sciascio, Eugenio
    Narducci, Fedelucio
    ACM COMPUTING SURVEYS, 2024, 56 (09)
  • [43] Arabic Question Answering Systems: Gap Analysis
    Biltawi, Mariam M.
    Tedmori, Sara
    Awajan, Arafat
    IEEE ACCESS, 2021, 9 : 63876 - 63904
  • [44] Development of an evaluation model for Question Answering Systems
    Ong, Chorng-Shyong
    Day, Min-Yuh
    Hsu, Wen-Lian
PROCEEDINGS OF THE 2008 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION, 2008, : 178+
  • [45] Exploiting Opinion Influence in Question Answering Systems
    Cercel, Dumitru-Clementin
    Onose, Cristian
    Trausan-Matu, Stefan
    Pop, Florin
    2017 IEEE 29TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2017), 2017, : 197 - 201
  • [46] A survey of consumer health question answering systems
    Welivita, Anuradha
    Pu, Pearl
    AI MAGAZINE, 2023, 44 (04) : 482 - 507
  • [47] A Quantitative Evaluation of Natural Language Question Interpretation for Question Answering Systems
    Asakura, Takuto
    Kim, Jin-Dong
    Yamamoto, Yasunori
    Tateisi, Yuka
    Takagi, Toshihisa
    SEMANTIC TECHNOLOGY (JIST 2018), 2018, 11341 : 215 - 231
  • [48] A Hybrid Approach for Question Classification in Persian Automatic Question Answering Systems
    Sherkat, Ehsan
    Farhoodi, Mojgan
    2014 4TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2014, : 279 - 284
  • [49] Multilingual Question Answering Systems: Question Classification in Spanish based in Learning
    Garcia Cumbreras, Miguel Angel
    Martinez Santiago, Fernando
    Alfonso Urena Lopez, L.
    Montejo Raez, Arturo
PROCESAMIENTO DEL LENGUAJE NATURAL, 2005, (34)
  • [50] Benchmarking Answer Verification Methods for Question Answering-Based Summarization Evaluation Metrics
    Deutsch, Daniel
    Roth, Dan
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 3759 - 3765