Zero-shot cross-lingual transfer language selection using linguistic similarity

Cited by: 4
Authors
Eronen, Juuso [1]
Ptaszynski, Michal [1]
Masui, Fumito [1]
Affiliations
[1] Kitami Inst Technol, 165 Koencho, Kitami, Hokkaido 0900015, Japan
Funding
US National Science Foundation; US National Institutes of Health;
Keywords
Multilingual natural language processing; Zero-shot learning; Transfer learning; Linguistics; Language similarity;
DOI
10.1016/j.ipm.2022.103250
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
We study the selection of transfer languages for different Natural Language Processing tasks, specifically sentiment analysis, named entity recognition and dependency parsing. In order to select an optimal transfer language, we propose to utilize different linguistic similarity metrics to measure the distance between languages and make the choice of transfer language based on this information instead of relying on intuition. We demonstrate that linguistic similarity correlates with cross-lingual transfer performance for all of the proposed tasks. We also show that there is a statistically significant difference in choosing the optimal language as the transfer source instead of English. This allows us to select a more suitable transfer language which can be used to better leverage knowledge from high-resource languages in order to improve the performance of language applications lacking data. For the study, we used datasets from eight different languages from three language families.
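The selection idea described in the abstract can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the record does not say which similarity metrics or which languages were used, so the feature vectors, the language names, the choice of cosine distance, and the helper names `cosine_distance` and `rank_transfer_languages` are all assumptions made for the example.

```python
import math

# Hypothetical binary typological feature vectors. The values are purely
# illustrative; they are not taken from the paper or any linguistic database.
FEATURES = {
    "english":  [1, 0, 1, 1, 0, 0],
    "german":   [1, 0, 1, 0, 0, 1],
    "finnish":  [0, 1, 0, 1, 1, 0],
    "japanese": [0, 1, 0, 0, 1, 1],
    "estonian": [0, 1, 1, 1, 1, 0],
}


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def rank_transfer_languages(target, candidates, features=FEATURES):
    """Sort candidate source languages from closest to farthest from target."""
    return sorted(
        candidates,
        key=lambda lang: cosine_distance(features[target], features[lang]),
    )


# Pick a transfer source for a (hypothetical) low-resource target language
# by distance rather than defaulting to English.
ranked = rank_transfer_languages(
    "estonian", ["english", "german", "finnish", "japanese"]
)
best_source = ranked[0]
```

With these made-up vectors the closest candidate differs from English, which mirrors the abstract's point that the default choice of English is not necessarily the best transfer source. Any real use would replace the toy vectors with an actual linguistic similarity measure.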
Pages: 19