Enabling deep learning for large scale question answering in Italian

Cited by: 2
Authors
Croce, Danilo [1]
Zelenanska, Alexandra [1]
Basili, Roberto [1]
Affiliations
[1] Univ Roma Tor Vergata, Dept Enterprise Engn, Rome, Italy
Keywords
Question answering in Italian; deep learning; recurrent neural network with attention
DOI
10.3233/IA-190018
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The recent breakthroughs in the field of deep learning have led to state-of-the-art results in several NLP tasks, such as Question Answering (QA). Unfortunately, neural QA systems have very strict requirements because of the size of the training datasets involved. In cross-linguistic settings these requirements are often not met, since training datasets for QA over non-English texts are frequently unavailable. This is the major barrier to widespread adoption of neural QA methods in NLP applications. This paper discusses the acquisition of a large-scale dataset for an open-domain factoid question answering system in Italian. The dataset is obtained by automatic translation and linguistic elicitation of an existing English dataset, i.e. the SQUAD question-answer pair corpus. Even though the quality of the resulting Italian corpus might not be completely satisfactory, our work made it possible to generate more than 60 thousand question-answer pairs. The paper studies the impact of this resource on the QA process over the Italian Wikipedia under different training conditions and architectural constraints. A comparative evaluation against the English version, in line with standards in the SQUAD literature, is carried out. The outcomes show that the results achievable for Italian are below the state of the art for English, but the ability to learn not to respond (i.e. the adoption of techniques for detecting questions whose answers are simply not available, yielding an EMPTY set of answers) allows the system to reach reasonable levels of precision. This makes it already usable in realistic application scenarios. Finally, an error analysis is presented that suggests possible future research directions on enhancements that are still critical but highly beneficial for concrete QA applications in Italian.
Pages: 49-61
Number of pages: 13