On the Multilingual Capabilities of Very Large-Scale English Language Models

Cited by: 0
Authors
Armengol-Estape, Jordi [1]
de Gibert Bonet, Ona [1]
Melero, Maite [1]
Affiliation
[1] Barcelona Supercomp Ctr, Placa Eusebi Guell 1-3, Barcelona 08034, Spain
Keywords
Multilingual; Cross-lingual; Language Modeling
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Generative Pre-trained Transformers (GPTs) have recently been scaled to sizes unprecedented in the history of machine learning. These language models have been shown to exhibit outstanding zero-, one-, and few-shot learning capabilities in a number of different tasks. Nevertheless, aside from anecdotal experiences, little is known about their multilingual capabilities, given that the pre-training corpus is almost entirely composed of English text. In this work, we investigate the potential and limits of GPT-3 in three tasks: extractive question answering, text summarization, and natural language generation, for five different languages, as well as the effect of scale in terms of model size. Our results show that GPT-3 can be used as a powerful generative pre-trained model not only for English but for other languages as well, even for some with very little data in the training corpus, with room for improvement if the tokenization is optimized.
Pages: 3056-3068
Page count: 13