Testing Your Question Answering Software via Asking Recursively

Cited by: 13

Authors:

Chen, Songqiang [1]

Jin, Shuo [1]

Xie, Xiaoyuan [1]

Affiliations:

[1] Wuhan University, School of Computer Science, Wuhan, People's Republic of China

Source:

2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE 2021) | 2021

Funding:

National Natural Science Foundation of China; National Key Research and Development Program of China

Keywords:

question answering; testing and validation; recursive metamorphic testing; natural language processing;

DOI:

10.1109/ASE51524.2021.9678670

Chinese Library Classification number:

TP31 [Computer software]

Discipline classification codes:

081202; 0835

Abstract:

Question Answering (QA) is an attractive and challenging area in the NLP community. Diverse algorithms have been proposed, and various benchmark datasets with different topics and task formats have been constructed. QA software is now also widely used in daily life. However, current QA software is mainly tested in a reference-based paradigm, in which the expected outputs (labels) of test cases must be annotated with considerable human effort before testing. As a result, neither just-in-time testing during usage nor extensible testing on massive unlabeled real-life data is feasible, which keeps the current testing of QA software from being flexible and sufficient. In this paper, we propose a method, QAAsker, with three novel Metamorphic Relations for testing QA software. QAAsker does not require annotated labels; instead, it tests QA software by checking its behavior on multiple recursively asked questions that are related to the same knowledge. Experimental results show that QAAsker can reveal violations in over 80% of valid cases without using any pre-annotated labels. Diverse answering issues, especially limited generalization on question types across datasets, are revealed in a state-of-the-art QA algorithm.
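The label-free idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation and does not reproduce its three Metamorphic Relations, which the abstract does not spell out. The relation used here is an assumption chosen for illustration: the answer to a source question is embedded in a follow-up question, and the follow-up answer is expected to contain a phrase already present in the source question. All names (`violates_relation`, `consistent_qa`, `inconsistent_qa`, the question template) are hypothetical.

```python
from typing import Callable

# Assumed interface of the QA software under test: (question, context) -> answer.
QASystem = Callable[[str, str], str]


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so answers compare loosely."""
    return " ".join(text.lower().split())


def violates_relation(qa: QASystem, context: str, source_question: str,
                      follow_up_template: str, known_phrase: str) -> bool:
    """Return True if the illustrative relation is violated.

    No annotated label is used: `known_phrase` is taken from the source
    question itself, and the follow-up question is built from the answer
    that the QA system returned.
    """
    source_answer = qa(source_question, context)
    follow_up_question = follow_up_template.format(answer=source_answer)
    follow_up_answer = qa(follow_up_question, context)
    return normalize(known_phrase) not in normalize(follow_up_answer)


# Two toy stand-ins for a QA system (hypothetical, for demonstration only).
def consistent_qa(question: str, context: str) -> str:
    q = normalize(question)
    if q == "who wrote pride and prejudice?":
        return "Jane Austen"
    if q == "what did jane austen write?":
        return "Pride and Prejudice"
    return "unknown"


def inconsistent_qa(question: str, context: str) -> str:
    q = normalize(question)
    if q == "who wrote pride and prejudice?":
        return "Jane Austen"
    return "Emma"  # contradicts its own earlier answer on the follow-up


CONTEXT = "Pride and Prejudice is a novel written by Jane Austen."
SOURCE_QUESTION = "Who wrote Pride and Prejudice?"
FOLLOW_UP_TEMPLATE = "What did {answer} write?"
KNOWN_PHRASE = "Pride and Prejudice"

consistent_result = violates_relation(
    consistent_qa, CONTEXT, SOURCE_QUESTION, FOLLOW_UP_TEMPLATE, KNOWN_PHRASE)
inconsistent_result = violates_relation(
    inconsistent_qa, CONTEXT, SOURCE_QUESTION, FOLLOW_UP_TEMPLATE, KNOWN_PHRASE)
```

In this sketch a violation signals only that the two answers cannot both be right; it does not say which one is wrong, which is the usual trade-off of testing without labels.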


Pages: 104-116

Number of pages: 13
