Translation Artifacts in Cross-lingual Transfer Learning

Authors
Artetxe, Mikel [1]
Labaka, Gorka [1]
Agirre, Eneko [1]
Affiliations
[1] Univ Basque Country UPV EHU, HiTZ Ctr, Bilbao, Spain
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Both human and machine translation play a central role in cross-lingual transfer learning: many multilingual datasets have been created through professional translation services, and using machine translation to translate either the test set or the training set is a widely used transfer technique. In this paper, we show that such translation processes can introduce subtle artifacts that have a notable impact on existing cross-lingual models. For instance, in natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them, which current models are highly sensitive to. We show that some previous findings in cross-lingual transfer learning need to be reconsidered in light of this phenomenon. Based on the insights gained, we also improve the state of the art on XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
Pages: 7674-7684
Page count: 11
Related papers
  • [1] Exploring Cross-Lingual Transfer Learning with Unsupervised Machine Translation
    Wang, Chao
    Gaspers, Judith
    Do, Quynh
    Jiang, Hui
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-IJCNLP 2021, 2021, : 2011 - 2020
  • [2] Choosing Transfer Languages for Cross-Lingual Learning
    Lin, Yu-Hsiang
    Chen, Chian-Yu
    Lee, Jean
    Li, Zirui
    Zhang, Yuyan
    Xia, Mengzhou
    Rijhwani, Shruti
    He, Junxian
    Zhang, Zhisong
    Ma, Xuezhe
    Anastasopoulos, Antonios
    Littell, Patrick
    Neubig, Graham
    [J]. 57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 3125 - 3135
  • [3] Learning Better Name Translation for Cross-Lingual Wikification
    Tsai, Chen-Tse
    Roth, Dan
    [J]. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 5528 - 5536
  • [4] Cross-Lingual Transfer Learning Framework for Program Analysis
    Li, Zhiming
    [J]. 2021 36TH IEEE/ACM INTERNATIONAL CONFERENCE ON AUTOMATED SOFTWARE ENGINEERING ASE 2021, 2021, : 1074 - 1078
  • [5] Cross-Lingual Transfer Learning for Statistical Type Inference
    Li, Zhiming
    Xie, Xiaofei
    Li, Haoliang
    Xu, Zhengzi
    Li, Yi
    Liu, Yang
    [J]. PROCEEDINGS OF THE 31ST ACM SIGSOFT INTERNATIONAL SYMPOSIUM ON SOFTWARE TESTING AND ANALYSIS, ISSTA 2022, 2022, : 239 - 250
  • [6] Cross-Lingual Transfer Learning for Spoken Language Understanding
    Do, Quynh Ngoc Thi
    Gaspers, Judith
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 5956 - 5960
  • [7] Cross-Lingual Transfer Learning for Complex Word Identification
    Zaharia, George-Eduard
    Cercel, Dumitru-Clementin
    Dascalu, Mihai
    [J]. 2020 IEEE 32ND INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2020, : 384 - 390
  • [8] Improving Cross-Lingual Transfer Learning for End-to-End Speech Recognition with Speech Translation
    Wang, Changhan
    Pino, Juan
    Gu, Jiatao
    [J]. INTERSPEECH 2020, 2020, : 4731 - 4735
  • [9] Reading Comprehension in Czech via Machine Translation and Cross-Lingual Transfer
    Mackova, Katerina
    Straka, Milan
    [J]. TEXT, SPEECH, AND DIALOGUE (TSD 2020), 2020, 12284 : 171 - 179
  • [10] Cross-lingual Continual Learning
    M'hamdi, Meryem
    Ren, Xiang
    May, Jonathan
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 3908 - 3943