MA-MRC: A Multi-answer Machine Reading Comprehension Dataset

Citations: 0
Authors
Yue, Zhiang [1]
Liu, Jingping [2]
Zhang, Cong [3]
Wang, Chao [4]
Jiang, Haiyun [5]
Zhang, Yue [2]
Tian, Xianyang [2]
Cen, Zhedong [2]
Xiao, Yanghua [1]
Ruan, Tong [2]
Affiliations
[1] Fudan Univ, Shanghai, Peoples R China
[2] East China Univ Sci & Technol, Shanghai, Peoples R China
[3] AECC Sichuan Gas Turbine Estab, Mianyang, Sichuan, Peoples R China
[4] Shanghai Univ, Shanghai, Peoples R China
[5] Tencent AI Lab, Shenzhen, Peoples R China
Keywords
Machine Reading Comprehension; Multiple Answer; Knowledge Graph
DOI
10.1145/3539618.3592015
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Machine reading comprehension (MRC) is an essential task for many question-answering applications. However, existing MRC datasets mainly focus on single-answer data and overlook multiple-answer cases, which are common in the real world. In this paper, we aim to construct an MRC dataset containing both single-answer and multi-answer data. To this end, we design a novel pipeline consisting of data collection, data cleaning, question generation, and test set annotation. Based on these procedures, we construct a high-quality multi-answer MRC dataset (MA-MRC) with 129K question-answer-context samples. We implement a series of baselines and carry out extensive experiments on MA-MRC. The experimental results show that MA-MRC is a challenging dataset that can facilitate future research on the multi-answer MRC task.
Pages: 2144-2148
Page count: 5