Enabling deep learning for large scale question answering in Italian

Cited by: 2
Authors
Croce, Danilo [1]
Zelenanska, Alexandra [1]
Basili, Roberto [1]
Affiliations
[1] Univ Roma Tor Vergata, Dept Enterprise Engn, Rome, Italy
Keywords
Question answering in Italian; deep learning; recurrent neural network with attention
DOI
10.3233/IA-190018
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The recent breakthroughs in the field of deep learning have led to state-of-the-art results in several NLP tasks, such as Question Answering (QA). Unfortunately, neural QA systems impose very strict requirements because of the size of the training datasets involved. In cross-linguistic settings these requirements are not satisfied, as training datasets for QA over non-English texts are often not available. This represents the major barrier to widespread adoption of neural QA methods in NLP applications. This paper discusses the acquisition of a large-scale dataset for an open-domain factoid question answering system in Italian. The dataset is obtained by automatic translation and linguistic elicitation of an existing English dataset, i.e. the SQuAD question-answer pair corpus. Even though the quality of the resulting Italian corpus might not be completely satisfactory, our work made it possible to generate more than 60 thousand question-answer pairs. The paper studies the impact of this resource on the QA process over the Italian Wikipedia under different training conditions and architectural constraints. A comparative evaluation against the English version, in line with standards in the SQuAD literature, is carried out. The outcomes show that the results achievable for Italian are below the state of the art for English, but the ability to learn not to respond (i.e. the adoption of techniques for detecting questions whose answers are simply not available, that is, an EMPTY set of answers) allows the system to reach reasonable levels of precision. This makes it already usable within realistic application scenarios. Finally, an error analysis is presented that suggests possible future research directions on still critical but highly beneficial enhancements, in view of concrete QA applications in Italian.
Pages: 49-61
Number of pages: 13