Synthetic Data Augmentation for Zero-Shot Cross-Lingual Question Answering

Authors
Riabi, Arij [1, 3]
Scialom, Thomas [2, 3]
Keraron, Rachel [3 ]
Sagot, Benoit [1 ]
Seddah, Djame [1 ]
Staiano, Jacopo [3 ]
Affiliations
[1] INRIA, Paris, France
[2] Sorbonne Univ, CNRS, LIP6, F-75005 Paris, France
[3] reciTAL, Paris, France
Funding
EU Horizon 2020
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Coupled with the availability of large-scale datasets, deep learning architectures have enabled rapid progress on Question Answering tasks. However, most of those datasets are in English, and the performance of state-of-the-art multilingual models is significantly lower when evaluated on non-English data. Due to high data collection costs, it is not realistic to obtain annotated data for each language one desires to support. We propose a method to improve Cross-lingual Question Answering performance without requiring additional annotated data, leveraging Question Generation models to produce synthetic samples in a cross-lingual fashion. We show that the proposed method significantly outperforms the baselines trained on English data only, thus establishing a new state of the art on four multilingual datasets: MLQA, XQuAD, SQuAD-it and PIAF (fr).
Pages: 7016-7030
Page count: 15